- Taboola Blog
Something strange happened while I worked with Kafka. While adding a new consumer from Kafka to one of our services, the service stopped consuming from ALL other existing consumers. As part of my job at Taboola as a team leader on a production team in the Infrastructure group, we’re supposed to remove bottlenecks, not create them. This post will describe how I investigated the issue, explain what I discovered, and share my insights into the whole situation. Some background Before I get into the rest of the story, here’s some background on how we use Kafka at Taboola’s events handling pipeline and why it’s critical to our infrastructure. Taboola’s recommendations appear on tens of thousands of web pages and mobile apps every second. As users engage with the content, multiple events are fired to signal that recommendations are rendered, opened, clicked, and so on. Each event triggers one or more Kafka messages, […]
Find out the secrets to how Taboola deploys and manages the thousands of servers that bring you recommendations every day.
Many R&D buzzwords and acronyms can seem like complex jargon — unnecessary shortcuts for concepts that are already pretty basic.
During the pandemic, most companies quickly adapted and moved to a work-from-home model, as a sudden necessity of the lockdown restrictions introduced by efforts to combat the spread of COVID-19.
We announced our plans to acquire Connexity, bringing eCommerce recommendations to the open web. And this is just the beginning.
The world is not flat, it’s highly nested With over 4 billion page views per day and over 100TB of data collected daily, scale at Taboola is no joke. Our primary data pipe deals with masses of data and endless read paths. Could we optimize our schema for all these read paths? Guess not… Our schema is HUGE and highly nested. After digesting the data, we keep it in hourly Parquet files on HDFS, where each hour consists of about 1-1.5TB of compressed data. Our schema roughly looks like this: root |– userSession: struct | |– maskedIp: long | |– geo: struct | | |– country: string | | |– region: string | | |– city: string | |– pageViews: array | | |– element: struct | | | |– url: string | | | |– referrer: string | | | |– widgets: array | | | | |– element: […]
You wrote your code. You even tested it. And now, you are eager to git push it. But how can you verify that it really works? In Taboola, we test our code in production! In this article, you will see how every software engineer, even on the first day in the company, can test in production – all thanks to a dedicated Jenkins pipeline job and lots of metrics. How hard is it to test in production? Quite hard. You probably already knew that. Everybody fears that moment when they need to test changes in production. The main reason is that not everyone has the required IT skills. Moreover, people have to repeat error-prone, manual tasks – which might result in downtime and revenue loss. For our release engineers, it was also an unmanageable headache – a “thundering herd” of developers eager to test their features in production. […]
Have you ever tried building an infrastructure to upload 150TB a day? Have you ever tried querying over 13PB without going bankrupt? These are some of Taboola’s PV2Google (pageviews to Google) service scale challenges that we deal with in our day to day. In this blog series, we’ll share how we do it, and the challenges we face. In this article (part 1) we’ll focus on the architecture. Part 2 covers the lessons we’ve learned over the years. Hello, Pageviews! Taboola’s goal is to power recommendations for publishers and advertisers. Our platform serves over 360 billion content recommendations and processes over two billion pageviews a day. Pageview is a record describing recommendations, user activity (such as a click), and much more on a user’s visit to a webpage. Currently, the pageview record has about 1,000 fields. Two billion pageviews generate a huge amount of data. This data is processed […]