Perl expert with many CPAN contributions, including OOPS, Stream::Aggregate, IO::Event, Daemon::Generic, and Test::MultiFork.
Designed and implemented a distributed aggregation framework that processed over 4 billion web pages per run. It generated a MySQL database used for research and for search-relevance features covering more than 50 million domain-path pairs, which helped rank search results (Searchme).
Led a small team to design and implement a general content management system for a new academic publisher that is still used and evolving seven years later (Bepress).
Designed and implemented a parallel distributed log processing framework that produced daily logs and business intelligence data for executives (Searchme).
Using a combination of techniques, achieved effective victory over inbound and outbound email spam as measured by the number of complaints about false positives, false negatives and outbound email blockages (Idiom).
Member of the Site Reliability Engineering team supporting Google’s internal customer relationship manager tool.
Part of the team of engineers maintaining the existing internal product packaging system and designing its evolution and replacement.
A key member of a team of researchers improving search results. My projects included Special Features Processing (a fault-tolerant feature extraction framework built on XML XPath and Perl fragments); Media SFP (feature extraction from YouTube, Flickr, etc.) and Torgo (search results scraping); Alcatraz (manual category overrides); log processing (built my own Hadoop-like framework to partition data for parallel processing); Docstore Aggregation; and a control framework for docstore augmentation systems. I also taught Perl programming and advised many others on their projects.
I managed the systems, networks, people and money for a small Internet Service Provider. We offered a full range of ISP services including email accounts, PPP dialup, virtual hosting, DSL, T1s, T3s, colocation, and even wholesale DSL for other ISPs. The billing and customer care software I built supported a superset of the legacy practices of each of the ISPs we acquired. We created two brands and purchased four brands. My major technical projects included: the billing system and customer care software; mail system configuration and programming; inbound and outbound spam filtering; the backup system; billing system data imports for purchased brands; email configuration data imports for purchased brands; the multi-brand IVR/call center using Asterisk; security infrastructure; and various complicated installations and upgrades over the years. Two of the purchased brands were integrated transparently so that the customers were mostly not aware of the transition. After achieving substantial profitability, I arranged an asset sale so that I could focus on my next activity without the distraction of running my own business.
Along with three Berkeley law and econ professors, I co-founded Bepress. I designed and built the underlying web toolkit and on top of it, the content management system to support the academic publishing workflow. My Perl toolkit included: transaction-safe object persistence; a context-sensitive web template language; user authentication and management; authorization, roles and permissions; web-based recipient-friendly mass mailer; strong separation of roles between web design and programming; a web-site structure editor; and a template editor. I supervised several programmers and other staff. Bepress continues today as a successful academic publisher.
I worked on two projects for Inktomi: in Perl, I built part of the infrastructure for analyzing the logs for the HotBot search engine; later, in C++, I built the log-generation module for the Traffic Server. The development environment for the Traffic Server made this challenging: everything was done with threads that were not allowed to block and had to keep their state in the heap.
At Comdisco, I ported our 3.8 million lines of C from VAX/VMS, Apollo, and Sun-3 to the Sun-4 and DECstation; released three major versions of our software product; recruited and trained two system administrators; built systems administration tools (the backup system and an automount automator); built the installation system for our products (Perl); and first maintained, and later rebuilt, our development environment including the version control system.
I built a distributed time-synchronization system in Java (NextBus Information Systems).
I ported applications (C++); debugged imake configurations; and enhanced the development environment (Teknekron Communications Systems).
I made our networked currency trading application (C++) resilient to failure: most system failures are automatically handled and the worst are detected and recovered with a restart of just the failed component (Berkeley Research and Trading).
I built a web-based mass mail sender with bounce tracking and reliable unsubscribe (Internet Profiles, Inc).
I built an executive availability tracker (Wells Fargo Bank).
I did internal information infrastructure programming for customer care, setup, provisioning, and billing (Best Internet).
I wrote an X11 font converter and debugged our MACH distribution (TRW Financial Systems).
My largest project as a student was release engineering for 60+ releases of UCSD Empire (a text-based multi-player real-time world simulation game).
My student jobs included programming for the Postgres Project, systems administration for Franz Inc, and systems administration for the Institute of Cognitive Studies.
A patent covering probabilistic text indexing, which is useful for duplicate detection for spam filtering, plagiarism detection, and results filtering.