In this article you will learn what Samplex is and how it is used to make processing of large raw datasets more efficient.
This post is not about K8S – nor is it about AWS. It is not about containers – nor is it about some new, “cool” technology for managing large-scale applications. Rather, this post is about how we deploy a highly sophisticated Java service, a heavy service that is actively developed on a daily basis, to thousands of servers across our 7 data centers around the world. So what’s the problem? Isn’t it enough to take a list of servers, get the version to deploy and run it with an automation tool like Ansible? Well, it’s not as simple as it might seem. This service serves Taboola’s recommendations and responds to hundreds of thousands of requests per second. The service has to be fast – so fast that its p95 should be below 500 milliseconds per request. That means we can’t have any downtime at all, or even afford slower […]
Optimizing Spark Executor Utilization: Harnessing Dynamic Allocation and Resource Management for Efficient Workload Processing.
Introduction: Newsrooms are under constant pressure to deliver the most up-to-date, relevant, and engaging information possible. At Taboola, we are building tools to make this faster, easier, and now predictable. As soon as an article is published, the team has a critical eye on engagement data. Garnering insight on article performance as soon as possible is critical for guiding content strategy. Some articles receive wide attention immediately, drawing hundreds of thousands of page views within minutes, while others may only see their first page view after a few hours. Taboola aims to narrow this gap even further by leveraging machine learning models to predict an article’s performance the moment it becomes available to the reader. Read on for details on our latest research and fascinating discoveries around predicting article performance! Article Data: Taboola Newsroom is a real-time optimization technology that empowers editorial teams with actionable data around what stories, headlines, […]
As a content discovery product, we need to be able to pace a campaign throughout its life in real time, spending the budget entirely without overspending. The team I am leading is responsible for serving Taboola’s video content. Our main goal is to enable the growth of our business, and owning the entire serving process of the video content can be crucial to this end. Depending on third-party serving systems for the core process would leave us vulnerable to rising prices, compromised features, serving latency, reduced performance and so on. In order to serve the video on our own, we had to come up with a way to pace the number of times we display the video and prevent instances in which the entire budget of the video would drain in a few seconds, a common scenario at Taboola’s scale when the video is not targeted aggressively. We […]
A couple of months ago my team had its first experience working with Java fibers, when we needed to make our main application work asynchronously. In this three-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. In Part 1 we talked about what fibers are at a high level, how they compare to threads and why we started to explore them. In Part 2 we went into more depth about how fibers differ from threads, how to create fibers, how to work with them and the basic concepts of how they work. In this part, we’ll discuss what’s going on under the hood in fibers, dive deep into the implementation of how fibers work, and share the lessons we learned during our journey working with them. We will also see how this magic happens… Under the hood: Fibers are implemented by instrumenting […]
A couple of months ago my team had its first experience working with Java fibers, when we needed to make our main application work asynchronously. In this three-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. In the previous part (Part 1), we talked about what fibers are at a high level, how they compare to threads and why we started to explore them. In this part we’ll go into more depth about fibers and how they differ from threads. We’ll see how to create fibers, how to work with them, and the basic concepts of how they work. Threads vs. Fibers: We searched for a reason not to stay with threads. We researched the costs and performance penalties of working with threads vs. fibers. We wanted to find proof that fibers can work better than threads, or at least shine in […]
A couple of months ago my team had its first experience working with Java fibers, when we needed to make our main application work asynchronously. In this three-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. We will cover what fibers are, how to use them, their pros and cons, and their internals, all in a mix between a guide and a blog describing our experience. Fibers are a kind of lightweight thread meant to address performance, scale and code structure in our applications; they can work alongside threads or replace them. If you are dealing with concurrency, code structure and asynchronous challenges, or you are just interested in learning this technology, this blog post series is for you. The first part of this series is an overview of what fibers are; the next parts dive deeper into the technology and […]
Prioritizing Kafka Topic Consumption: How I Developed a Mechanism to Optimize Message Handling. Discover how to handle messages efficiently.
Regression Testing a Complex UI: Web applications are a mischievous bunch. When they are born they are usually small, clean and orderly, but they may grow up to become complex and error-prone monsters. Each feature added to the mix increases the chance of a new bug appearing, a promise that is usually fulfilled. When developing a large and complex web application, we need to be able to continually check for regressions and verify that everything that worked until now is still working, so that at least we won’t break more than we need to. Taboola is a web company and must deliver at a very rapid pace. We ship many features on a continuous basis. Under such circumstances, no matter how big your QA team is, it will never manage to cover the entire system for every delivered feature. This means an extensive test automation framework is a must. […]