Fedora Making Progress On New Privacy-Minded System For Counting User Statistics
Earlier this month a change proposal was announced that would give Fedora systems a new unique UUID tracking identifier for counting systems. The intention isn't to track users but rather to provide more statistics about the Fedora install base than the current system, which just tracks unique IP addresses. A revised proposal would now improve privacy while still offering much of the same statistics potential.
Rather than relying upon a unique identifier that is transmitted to the Fedora update servers, the revised proposal focuses on transmitting just the "variant" (indicating whether you are running Fedora Workstation or one of the other spins) and a new "countme" variable. That countme variable would be managed client-side and, under current thinking, would increment weekly to reflect the age of the Fedora system. That would allow Fedora to see the age of systems, fresh installs vs. systems upgraded to new releases, the number of users just running in Docker / cloud / other short-lived instances, and other metrics, all without relying upon a per-system UUID.
That locally stored countme variable could obviously be manipulated by the end-user, but it's more privacy-minded and hopefully not a cause for concern for users. The latest revision of this proposal can be found on the Fedora Wiki.
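To make the idea concrete, here is a minimal sketch of how a client-side weekly counter could be derived. It is not the proposal's actual implementation: the function names, the dictionary layout, and the choice to start counting at zero are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

WEEK = timedelta(weeks=1)


def countme_value(installed_at, now):
    """Return the number of whole weeks elapsed since installation.

    Hypothetical helper: starting the counter at 0 is an assumption,
    not something the proposal specifies.
    """
    if now < installed_at:
        raise ValueError("current time precedes the install time")
    # timedelta // timedelta yields an int (floor division).
    return (now - installed_at) // WEEK


def build_report(variant, installed_at, now):
    """Assemble the only two fields the revised proposal would transmit.

    The dictionary keys are illustrative; no per-system identifier is included.
    """
    return {"variant": variant, "countme": countme_value(installed_at, now)}


if __name__ == "__main__":
    installed = datetime(2019, 1, 1, tzinfo=timezone.utc)
    today = datetime(2019, 1, 22, tzinfo=timezone.utc)
    print(build_report("workstation", installed, today))
```

Because the value is derived entirely on the client, nothing in the report distinguishes one three-week-old Workstation install from another, which is the privacy property the revised proposal is after.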
On-topic discussion of this effort to gather better installation statistics has also been happening in this FESCo ticket. Fedora is focused on getting an accurate look at the number of installs, how many of those installs are long-term installations, the most-used Fedora variants, and the impact on any short-term spins over the longer term. An incrementing "countme" plan would also allow seeing how quickly users shift to new Fedora releases, how many systems are upgraded for every release, and related metrics.
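As a rough illustration of how such metrics might be derived on the server side, the sketch below tallies anonymous reports by variant and splits them into short-lived and longer-term systems. The report layout, the `summarize` name, the one-week threshold, and the premise that each system contributes one report per counting period are all assumptions for this example, not details from the proposal.

```python
from collections import Counter


def summarize(reports, short_lived_weeks=1):
    """Aggregate anonymous reports into install-base statistics.

    `reports` is a list of dicts with hypothetical "variant" and "countme"
    keys; it is assumed to hold one report per system for the period.
    """
    by_variant = Counter()
    short_lived = 0
    for report in reports:
        by_variant[report["variant"]] += 1
        # Systems younger than the threshold are treated as short-lived
        # (e.g. containers or cloud instances); the threshold is illustrative.
        if report["countme"] < short_lived_weeks:
            short_lived += 1
    return {
        "by_variant": dict(by_variant),
        "short_lived": short_lived,
        "long_term": len(reports) - short_lived,
    }


if __name__ == "__main__":
    sample = [
        {"variant": "workstation", "countme": 30},
        {"variant": "workstation", "countme": 0},
        {"variant": "server", "countme": 12},
    ]
    print(summarize(sample))
```

No field in these reports identifies an individual machine, so the aggregation works purely on counts.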
Fedora's Matthew Miller also shared their current mirror statistics as pictured above. Based on this current methodology, they see Fedora 29 at about 40% higher than the Fedora 28 peak, which they believe is too good to be true. So while they would be happy to see 200,000+ unique IPs per day on the latest release and remarkable growth, they think the current IP-based counting is flawed and would like a better solution.