CTOs to Know: Meet data.world's Bryon Jacob

Former senior architect at HomeAway joins elite crew of Austin data executives to launch data.world this year. See what Bryon Jacob has learned along the way and the technology he advocates for.

Written by Kelly O'Halloran
Published on Sep. 15, 2016

Backed by a "dream team" of Austin entrepreneurs and experienced data experts, data.world launched earlier this year to provide a single platform, something like a social network, where data geeks can locate, share and analyze open data sets.

The founding group consists of CEO Brett Hurt, cofounder of Bazaarvoice; COO Matt Laessig, formerly of Bazaarvoice and HomeAway; CPO Jon Loyens, former HomeAway VP of engineering; and CTO Bryon Jacob, former principal architect at HomeAway.

Before becoming CTO, Jacob also worked at both Trilogy and Amazon. He shared with us the challenges of playing an integral role in integrating more than 30 acquisitions during HomeAway's growth period, the importance of Austin's tech network, and why he looks for an ownership mentality in his team.

What technologies power your business?

For applications and services with long-running processes, we containerize everything into Docker containers during our build process, and use Docker hosts to manage the processes in all of our environments. The applications themselves are split between a Node.js/React and Redux stack for everything with a user interface, and Java/Scala for our data processing pipeline and state-management APIs. The team we've hired judged these to be the best-of-breed solutions in those two areas, and there's no shortage of available talent for either of those stacks in Austin, so we are prepped to grow with those technologies.

We run everything we do on AWS, and make use of managed services wherever we can. We use DynamoDB, Kinesis, and Simple Workflow Service, for example. These kinds of technologies are huge accelerators at an early stage. There are great open and portable solutions for these things, but there's a real cost to getting up and running with those technologies and managing the operations of those clusters. It's a huge benefit to be able to defer those costs and focus engineering time on building the core competencies of our platform. And if you're architecting your systems right, there's not a real issue of "vendor lock-in." I don't think it would cost us any more to replace our usage of DynamoDB with Cassandra, or of Kinesis with Kafka, than it would cost to migrate a running Cassandra or Kafka cluster to a new hosting provider. These things are pretty decoupled from our application code, and that frees us to make intelligent decisions about how we prioritize precious engineering time.
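To make the decoupling point concrete, here is a minimal sketch of the general pattern, not data.world's actual code. The names KeyValueStore, InMemoryStore and DatasetCatalog are hypothetical, and the in-memory class merely stands in for a managed backend. The idea is that application code depends only on a small interface, so a DynamoDB-backed or Cassandra-backed class could be swapped in behind it.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical interface that the application codes against."""

    def put(self, key: str, value: dict) -> None: ...

    def get(self, key: str) -> Optional[dict]: ...


class InMemoryStore:
    """Stand-in backend for illustration only.

    A class backed by DynamoDB or Cassandra would expose the same two methods.
    """

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def put(self, key: str, value: dict) -> None:
        # Copy on write so callers cannot mutate stored state by accident.
        self._items[key] = dict(value)

    def get(self, key: str) -> Optional[dict]:
        item = self._items.get(key)
        return dict(item) if item is not None else None


class DatasetCatalog:
    """Application code: knows about KeyValueStore, not about any vendor."""

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store

    def register(self, dataset_id: str, title: str) -> None:
        self._store.put(dataset_id, {"title": title})

    def title_of(self, dataset_id: str) -> Optional[str]:
        item = self._store.get(dataset_id)
        return item["title"] if item else None


catalog = DatasetCatalog(InMemoryStore())
catalog.register("ds-1", "City budgets")
```

Swapping the backend then means writing one new class with the same two methods, while DatasetCatalog stays unchanged.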

What technologies are playing the biggest roles at your company this year?

One place where we made a big investment that I'm particularly proud of is in building out our CI/CD pipeline. All of our code, from the infrastructure code that provisions resources, to the core data processing pipeline, to the front-end web UI, is continuously built, deployed, and tested, and can be shipped to production with a git command. We've got a complex system with a number of moving parts, and it's something that took a little time to orchestrate up front, but it's paid for itself already in my opinion. There's nothing like being able to ship quickly and confidently as soon as a feature is ready.

What are the biggest tech projects your team is working on this year?

We just launched our preview release in July, and we've had an amazing response so far. It's been an interesting challenge, because we're really dedicated to lean and agile methodologies and building things in a very active, iterative conversation with our users. But for a platform like data.world, the MVP is pretty meaty and complex, and we had to build a significant amount of tech in order to have the raw material to iterate quickly on features.

One set of technologies that is core to what we do is the Semantic Web: RDF, triple stores, and SPARQL. These are W3C-standardized technologies that have been around for a while, and have gotten some great use in large organizations that are focusing on deep graph analytics, but haven't really been democratized as a mainstream tool for data management. We hope to change that, and a lot of our work goes into going deep on those technologies, and then extracting the essence into features that are closer to the language of most data professionals. It's a huge project, but we're already seeing evidence that it's worth it. Our focus on Linked Data has enabled us to build some very complex features that I don't think we'd have yet otherwise.
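For readers new to these terms, the toy sketch below shows the core idea behind a triple store: data is held as (subject, predicate, object) statements and queried by pattern. It is an illustration only, not a real RDF library and not data.world's implementation. The TripleStore class and the ex: identifiers are made up for the example. The final query is roughly what the SPARQL pattern `?dataset dcterms:publisher ex:cityOfAustin` would express.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Toy in-memory store of (subject, predicate, object) statements."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        # Sorting makes the result order deterministic for the example.
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


store = TripleStore()
# Hypothetical sample data using made-up "ex:" identifiers.
store.add("ex:budget2016", "dcterms:title", "City budget 2016")
store.add("ex:budget2016", "dcterms:publisher", "ex:cityOfAustin")
store.add("ex:permits", "dcterms:publisher", "ex:cityOfAustin")
store.add("ex:weather", "dcterms:publisher", "ex:noaa")

# Which datasets were published by ex:cityOfAustin?
austin_datasets = [
    s for s, _, _ in store.match(predicate="dcterms:publisher", obj="ex:cityOfAustin")
]
```

Because every statement has the same three-part shape, data from different sources can be merged into one graph and queried together, which is the property Linked Data builds on.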

What are the biggest technology challenges you’ve faced in the past? How did you overcome them?

I was at HomeAway for 10 years, from pretty close to the beginning through the acquisition this year by Expedia. During that time we integrated over 30 companies, and their employees, and their technology. And it wasn't like we were an established company with a robust platform as a starting point, where we could just migrate these companies into the fold. We were figuring out the business, assimilating knowledge from the acquisitions, and designing and deploying our global platform as we went. From a technology POV, what you learn in that process is how important having a flexible, open technology architecture is. HomeAway was already very much a startup with all the curveballs you'd expect trying to figure out your market and fit your product to it. When you're acquiring new companies multiple times a year, you have to juggle robustness and the continuity of the businesses with support for feature development.

One example of a technology decision we made early on that was a huge force multiplier was the HomeAway Search Service. It was a syndication and search service over listing data that came from any of our individual brands' databases. The service was based around Lucene with a custom orchestration layer for clustering servers (this was before Elasticsearch or Solr), and it allowed us to bring in a new listing database by writing a simple publishing adapter that could poll the database for changes to listing content and push the documents into the Lucene indexes. It solved the business problem of syndicating content from many listing databases onto the homeaway.com website, and it proactively addressed runtime scale by moving almost all of our read traffic out of the transactional databases. Most of the companies that HomeAway acquired in the early days were pretty simple two-tier web applications that hadn't really been architected for internet-scale search traffic. Some weren't built for read/write replication, for example. While over time we eventually rebuilt all of that to be a very scalable data layer, we were able to postpone it and focus on features for years because that search service took 99 percent of the read traffic off of the databases, allowing them to focus on handling writes. Around 2013, I think, all of the orchestration code from that original service was replaced with Elasticsearch, which was a far more robust and solid solution for managing a large search cluster. But having a service like that as part of our architecture from early on really fueled our growth.

While it was really pronounced in the HomeAway story, where we acquired so much outside tech, I think these lessons are really important for any tech entrepreneur. There are always going to be curveballs and changes in direction. The role of architecture in software is to be able to respond to the things you can't predict and to have a system where new things can "plug in" to your existing infrastructure with minimal disruption to what already works.
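The publishing adapter Jacob describes for the search service can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not HomeAway's code. The Listing, SearchIndex and PublishingAdapter names are hypothetical, a plain list stands in for a brand's transactional database, a dictionary stands in for the Lucene indexes, and each listing is assumed to carry an increasing change stamp that the adapter can poll against.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Listing:
    listing_id: str
    title: str
    updated_at: int  # assumed: an increasing change stamp set on every write


class SearchIndex:
    """Stand-in for the shared search index that serves read traffic."""

    def __init__(self) -> None:
        self.documents: Dict[str, dict] = {}

    def upsert(self, doc_id: str, document: dict) -> None:
        self.documents[doc_id] = document


class PublishingAdapter:
    """Polls one brand's listing source and pushes changed listings to the index."""

    def __init__(self, brand: str, source: List[Listing], index: SearchIndex) -> None:
        self._brand = brand
        self._source = source  # stand-in for the brand's transactional database
        self._index = index
        self._high_water_mark = 0  # newest change stamp already published

    def poll_once(self) -> int:
        """Publish listings changed since the last poll; return how many."""
        changed = [item for item in self._source if item.updated_at > self._high_water_mark]
        for listing in sorted(changed, key=lambda item: item.updated_at):
            # Prefix the id with the brand so listings from different brands never collide.
            doc_id = f"{self._brand}:{listing.listing_id}"
            self._index.upsert(doc_id, {"brand": self._brand, "title": listing.title})
            self._high_water_mark = listing.updated_at
        return len(changed)


index = SearchIndex()
brand_db = [Listing("1", "Lake cabin", 1), Listing("2", "Beach house", 2)]
adapter = PublishingAdapter("brand-a", brand_db, index)
adapter.poll_once()  # the first poll publishes both listings
```

In this shape, bringing in a newly acquired brand means writing one more adapter against its database, while searches keep reading from the shared index rather than from the transactional store.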

What are lessons you’ve learned about working in Austin that other local entrepreneurs can learn from?

Austin's tech community is hyper-connected, way more so than Silicon Valley. This confers a natural advantage to folks with deeper roots in town, but I've also seen a lot of evidence that the networks here are easier to penetrate than in other tech hub cities.  There's an explicit drive to support and foster the tech community among the established players. One example is what Josh Baer does with Capital Factory. He has a monthly "ask me anything" meetup where he gives an introduction to the local scene and then fields questions. I've known a number of entrepreneurs who got their introduction to Austin doing just that, and it's helped them build out strong networks. Getting involved at the various incubators, accelerators, and meetup groups in town is something anyone can do and it makes a big difference.

Austin is known for having a large talent pool of thirsty, young workers. What are the top characteristics you look for in a potential hire?

The number one thing I look for is an ownership mentality. I want to know that a candidate is the sort of person who is going to assert themselves on their work, be opinionated, and drive it forward. I think a company culture where everyone feels responsibility for and pride in their work is the most important thing - and that's the sort of thing that's harder to teach.  This manifests differently in senior versus junior employees. For senior hires, I'm looking for lots of tangible evidence that you've been a leader on your past projects, and for junior hires I'm looking for a critical eye, an analytical approach to the performance of their work and a desire to be a leader.  And "leader" doesn't mean manager or even a formal designation like "tech lead" - a lot of times the leaders on an engineering team are just the ones who push things like best practices in testing and code reviews, and hold their peers accountable. The ones who push the team to "do the right thing" rather than waiting for someone else to define the work.

How would your team describe working with you?

I think they would describe me as a collaborator. I always want to work with a group of self-motivated, confident people, and that's definitely the kind of team we've built at data.world. I want to work with people who are great at their job and know they are, so there's a little bit of swagger, but who are also humble and want to work together and bring everyone's ideas to the table. I view my job as keeping an eye on the big picture, advocating for technology to the rest of the company and externally, and looking ahead to where we need to be steering over the long run. When I'm working with the team on architecture, design, or planning, I think I'm mostly another voice in the room, and I'd be very happy if my team felt that way, too.

Image provided by data.world. Some answers have been edited for clarity and length.

Want to nominate a CTO for this series? Tell us or tweet @BuiltInAustin.
