I first publicly wrote about Anzwix back in 2011 as the back-end system behind most of what happens on Phoronix. Anzwix finds most of the interesting news stories and open-source events for me on a daily basis, and does so efficiently enough that I'm able to run Phoronix.com nearly single-handedly while remaining very competitive with much larger staffed news sites.
Anzwix continues to be a large work-in-progress that's constantly evolving to adapt to new content sources, from monitoring dozens of Git repositories to project mailing lists, blogs, and social networking feeds, all while systematically analyzing the potential interest level of a given piece of content or event.
Here's the tentative about page for the soon-to-launch Anzwix:
Anzwix is an experimental project led by Michael Larabel, the founder of Phoronix Media and lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org open-source automated testing / benchmarking products, along with involvement in other Linux and open-source projects.
What Is Anzwix?
Anzwix serves as a simplified public front-end to an automated news classification and analytics system developed by Michael Larabel over his tenure as the editor of Phoronix. Over nine years, Larabel has been single-handedly responsible for writing over 2,400 feature-length articles and more than 9,100 news articles. The back-end system to Anzwix is what has effectively allowed Michael to publish news promptly, to monitor the diverse range of open-source software projects covered by Phoronix, and to deliver hundreds of exclusive stories on Phoronix.
Originally the news aggregation process at Phoronix involved subscribing to hundreds of different RSS feeds for version control systems and subscribing to countless mailing lists, project blogs, social networking accounts, etc. Over time, the back-end system to Anzwix was developed to systematically analyze mailing list traffic, DVCS activity, and other content sources, narrowing down the field of potentially interesting events in real-time.
Seeing a code commit that updates translations or fixes a small bug isn't interesting, but if the commit or patch greatly improves the application's performance, adds a major new feature, or has some other great trait, it's likely worth writing about and exploring further. Long story short, it can be viewed as a sort of "priority filter" for reading about all of the latest upstream open-source development advancements as they happen. The back-end is modeled around concepts from experience in algorithmic (HFT) trading and data intelligence platforms.
Speeding Up The News Process
Anzwix has made it possible to go from needing to read over one thousand mailing list messages, code revisions, and other data points per day to only having to deal with a few dozen or so, which the system marks as likely being the most interesting. The system determines a weight for each data point based upon keywords found within the message, the size and other attributes of any contained patches, the author/sender and their automatically determined importance within a given project, and a variety of other factors. Going even further, related messages/commits are cross-referenced against how much traffic they have generated in the past on Phoronix via integration with its PHXCMS content management system. Many other factors also weigh into a mailing list post or code commit to try to autonomously determine its importance and potential interest. It's been developed over several years and has evolved into quite a unique and tailored system.
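To make the weighting idea concrete, here is a minimal Python sketch of how such a score could be put together. It is only an illustration and not the actual Anzwix code: the keyword table, the `DataPoint` fields, the multipliers, and the `score` and `shortlist` helpers are all hypothetical, and author importance and past traffic are assumed to be precomputed values between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical keyword weights; the real system's terms and values are not public.
KEYWORD_WEIGHTS = {
    "performance": 3.0,
    "new feature": 2.5,
    "regression": 2.0,
    "translation": -2.0,
    "typo": -1.5,
}


@dataclass
class DataPoint:
    subject: str
    body: str
    patch_lines: int          # lines changed by any attached patch, 0 if none
    author_importance: float  # assumed 0.0 to 1.0, precomputed per project
    past_traffic: float       # assumed 0.0 to 1.0, from earlier related coverage


def score(item: DataPoint) -> float:
    """Combine the factors into one weight (all multipliers are made up)."""
    text = f"{item.subject} {item.body}".lower()
    keyword_score = sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)
    # Cap the patch size so one huge patch cannot dominate the score.
    size_score = min(item.patch_lines, 500) / 500
    return (keyword_score + size_score
            + 2.0 * item.author_importance
            + 1.5 * item.past_traffic)


def shortlist(items, limit=3):
    """Return the highest-scoring items, most interesting first."""
    return sorted(items, key=score, reverse=True)[:limit]


if __name__ == "__main__":
    inbox = [
        DataPoint("Update German translation", "", 40, 0.1, 0.0),
        DataPoint("Improve shader compile performance",
                  "Adds a new feature flag", 300, 0.9, 0.8),
        DataPoint("Fix typo in README", "", 2, 0.5, 0.1),
    ]
    for item in shortlist(inbox, limit=2):
        print(f"{score(item):6.2f}  {item.subject}")
```

A simple additive score like this is easy to tune one factor at a time, which is the main reason to start there; a production system would presumably use far more signals than this sketch does.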
The Public Anzwix
With Anzwix being designed around the interests and focus of Phoronix, the system slants towards information that is of interest to Linux/open-source/hardware enthusiasts, especially as it pertains to new features/functionality and performance. The system also tends to flag content that's potentially polarizing, of interest to the mass desktop user-base, or a structural change to the project. The Anzwix hit-rate isn't perfect and there are some false positives, but it filters out a clear majority of unrelated and uninteresting noise from mailing lists, Git repositories, and other sources.
By making the front-end publicly accessible, the goal is to make Anzwix cover more open-source software projects beyond just the focus of what's covered on Phoronix. Other mailing lists and Git repositories have been added to Anzwix.com for automatic coverage. There's also a stronger news flow, and those who are more interested in some projects than others will be able to tune their news flow through Anzwix. There will still be the Phoronix.com news, but Anzwix is the frontline for knowledgeable enthusiasts and those Linux/open-source users living on the bleeding-edge, without any bias or drama. Additionally, Anzwix.com aims to be a better mailing list archive and Git commit viewer than other publicly existing solutions.
The public version of Anzwix.com is simplified compared to the advanced version used internally at Phoronix, which also allows monitoring Planet blogs, bug tracking systems, Wiki systems, and other websites as potential news feeds serving this centralized and uniform system. There are also performance dashboards and other metrics exposed internally. Experiments will be conducted on Anzwix.com over the coming months to determine what is worth opening up to the public, what code might be open-sourced, and what other measures could be taken to enrich this unique web portal. Your feedback is welcome by contacting Michael Larabel.
I'm looking at offering an early sneak peek of Anzwix to Phoronix Premium subscribers by the end of next week, while general public access should happen by the end of August.