“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” Martin Fowler, 2008. Names are everywhere in our software. Just think of the things we name: packages, classes, methods, variables. In fact, we programmers do so much naming that we should probably know how to do it well. In my opinion, making your code readable is just as important as making it work. In this post I will give you 5 tips and guidelines for choosing names that make your code more readable. 1. Reveal your intent: The name you choose should answer as many questions as possible for the reader, such as why it exists, what it does, and how it is used. Choosing good names takes time but saves more than it takes when the going gets tough, so […]
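As a small illustration of the "reveal your intent" tip, here is a hedged sketch in Python. The functions, the `(item_id, click_count)` data shape and all names are hypothetical and chosen only for the example. Both functions behave identically; only the second tells the reader why it exists and how it is used.

```python
# Before: the names reveal nothing about what is filtered or why.
def get(l, n):
    r = []
    for x in l:
        if x[1] > n:
            r.append(x)
    return r


# After: same behavior, but the names answer the reader's questions.
CLICK_COUNT_INDEX = 1  # position of the click count in an (item_id, click_count) pair


def find_items_above_click_threshold(items, min_clicks):
    """Return the (item_id, click_count) pairs with more than min_clicks clicks."""
    return [item for item in items if item[CLICK_COUNT_INDEX] > min_clicks]


# Hypothetical sample data, used only to show both versions agree.
sample_items = [("a", 3), ("b", 12), ("c", 40)]
popular_items = find_items_above_click_threshold(sample_items, 10)
```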
Some of the problems we tackle using machine learning involve categorical features that represent real-world objects, such as words, items and categories. So what happens when, at inference time, we get new object values that have never been seen before? How can we prepare ourselves in advance so we can still make sense of the input? Unseen values, also called OOV (Out of Vocabulary) values, must be handled properly. Different algorithms have different methods for dealing with OOV values, and different assumptions about the categorical features call for different treatment as well. In this post, I’ll focus on the case of deep learning applied to dynamic data, where new values appear all the time. I’ll use Taboola’s recommender system as an example. Some of the inputs the model gets at inference time contain unseen values – this is common in recommender systems. Examples include: Item id: each recommendable item gets […]
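To make the OOV problem concrete, here is a minimal sketch of one common baseline: reserving a single index for every value that was not seen during training. This is an assumption made for illustration, not necessarily the approach the post describes, and the names `OOV_INDEX`, `build_vocabulary` and `encode` are hypothetical.

```python
OOV_INDEX = 0  # assumed convention: index 0 is reserved for unseen values


def build_vocabulary(training_values):
    """Map each distinct training value to an index starting at 1."""
    vocabulary = {}
    for value in training_values:
        if value not in vocabulary:
            vocabulary[value] = len(vocabulary) + 1
    return vocabulary


def encode(value, vocabulary):
    """Return the index of a value, or OOV_INDEX if it was never seen in training."""
    return vocabulary.get(value, OOV_INDEX)


# Hypothetical item ids seen during training.
item_vocabulary = build_vocabulary(["item_1", "item_2", "item_1"])
seen_index = encode("item_2", item_vocabulary)
unseen_index = encode("item_99", item_vocabulary)  # falls back to OOV_INDEX
```

With this scheme every unseen value shares one index (and, in a deep learning model, one embedding), which is simple but discards whatever distinguishes the new values from one another.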
If you are using web cookies to operate your online business, you probably already know that, just like in real life, cookies do not last long. This is especially well known to anyone who uses online cookies to store unique user IDs. Most online marketing companies rely on cookies for that purpose, but when cookies disappear, it becomes harder for them to get persistent user data. Want to know how long a cookie really lasts? In this post I’ll try to provide some answers. Who is eating web cookies? Cookies can disappear for various reasons, such as: the user clearing the browser’s historical data; setting the browser to reject third-party cookies; using tools that clean up your device and free up storage space; use of VPNs, ad blockers and more. One very common reason cookies disappear is the use of private browsing modes such as Incognito […]
In the last couple of years deep learning (DL) has become a main enabler for applications in many domains such as vision, NLP, audio, click-stream data, etc. Recently, researchers have started to successfully apply deep learning methods to graph datasets in domains like social networks, recommender systems and biology, where data is inherently structured as a graph. So how do Graph Neural Networks work? Why do we need them? The Premise of Deep Learning: In machine learning tasks involving graph data, we usually want to describe each node in the graph in a way that allows us to feed it into some machine learning algorithm. Without DL, one would have to manually extract features, such as the number of neighbors a node has. But this is a laborious job. This is where DL shines. It automatically exploits the structure of the graph in order to extract features for […]
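The manual feature mentioned above, the number of neighbors a node has, can be sketched as follows. This is a minimal illustration that assumes an undirected graph without self-loops, given as a list of edges; the function name `neighbor_counts` and the sample edges are hypothetical.

```python
from collections import defaultdict


def neighbor_counts(edges):
    """Count the distinct neighbors of each node in an undirected edge list."""
    neighbors = defaultdict(set)
    for source, target in edges:
        # Undirected graph: each edge makes both endpoints neighbors of each other.
        neighbors[source].add(target)
        neighbors[target].add(source)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


# Hypothetical toy graph.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degree_feature = neighbor_counts(toy_edges)
```

Every additional hand-crafted feature needs its own function like this one, which is the laborious part that learned representations aim to avoid.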
Ever thought about presenting your work to others? Speaking at a meetup or a conference? In the past I couldn’t even think about it; I thought it wasn’t for me and that I wouldn’t get any benefit from it at all. In the last year and a half, things have started to change. In the following post I will share how the drive for continuous improvement took me out of my comfort zone and put me in places and scenarios I never imagined. I started my journey in the software development world 8 years ago. I had some knowledge, and almost no experience. I studied industrial engineering and didn’t think I would practice software development. But things changed and I found my first role as a manual QA engineer, then QA automation engineer, automation developer, and for the last 5 years DevOps / Release engineer. I was always […]
We all have these amazing machines in our development and testing labs, and we know that our real users do not share this wonderful world. They experience our products very differently from us. These differences result in two major challenges: we do not know what the users experience, and we cannot debug their machines. As a Video Advertisement Player team, these challenges are multiplied. Why? Our product is a third-party script that serves other third-party scripts for websites. Your code runs on different platforms: As a third-party web product, you do not know which websites your code runs on. Websites have a variety of frameworks, architectures and styles. Frameworks – change the browser’s core behavior, for example by redefining methods, which challenges the product’s basic behavior. Architectures – affect the website’s performance, which impacts the product’s natural flow. Styles – manipulate the product’s look and feel. Running […]
A couple of months ago my team had its first experience working with Java fibers; we needed to make our main application work asynchronously. In this 3-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. In Part 1 we talked about what fibers are at a high level, how they compare to threads and why we started to explore them. In Part 2 we went into more depth on how fibers differ from threads, how to create fibers, how to work with them and the basic concepts of how they work. In this part, we’ll discuss what’s going on under the hood in fibers, dive deep into the implementation of how fibers work, and share the lessons we learnt during our journey working with them. We will also see how this magic happens… Under the hood: Fibers are implemented by instrumenting […]
A couple of months ago my team had its first experience working with Java fibers; we needed to make our main application work asynchronously. In this 3-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. In the previous part (Part 1), we talked about what fibers are at a high level, how they compare to threads and why we started to explore them. In this part we’ll go deeper into fibers and how they differ from threads; we’ll see how to create fibers, how to work with them, and the basic concepts of how they work. Threads vs. Fibers: We looked for a reason not to stay with threads. We researched the costs and performance penalties of working with threads vs. fibers. We wanted to find proof that fibers can work better than threads, or at least shine in […]
A couple of months ago my team had its first experience working with Java fibers; we needed to make our main application work asynchronously. In this 3-part series, I will share my team’s experience and how we deploy and implement Java fibers in production. We will cover what fibers are, how to use them, their pros and cons, and their internals, all in a mix of guide and blog post describing our experience. Fibers are a sort of lightweight thread meant to address performance, scale and code structure in our applications; they can work alongside threads or replace them. If you are dealing with concurrency, code structure and asynchronous challenges, or you are just interested in learning this technology, this blog post series is for you. The first part of this series is an overview of what fibers are; the next parts dive deeper into the technology and […]
About a year ago we incorporated a new type of feature into one of our models used for recommending content items to our users. I’m talking about the thumbnail of the content item. Up until that point we used the item’s title and metadata features. The title is easier to work with than the thumbnail, machine-learning-wise. Our model had matured and it was time to add the thumbnail to the party. This decision was the first step towards a horrible bias introduced into our train-test split procedure. Let me unfold the story… Setting the scene: In our experience, it’s hard to incorporate multiple types of features into a unified model. So we decided to take baby steps, and add the thumbnail to a model that uses only one feature – the title. There’s one thing you need to take into account when working with these […]