What is cross-site tracking and how to prevent it

Written By - Garvit Maloo

11 February, 2024

Has it ever happened that you were browsing a website and suddenly saw suggestions for products you recently explored on an e-commerce platform? Or have you ever wondered what on Earth cookies are when a website shows one of those annoying "Accept cookies" popups? Read this article to understand what these are.

What is cross-site tracking

Cross-site tracking is a practice where a user's online activities are tracked across different websites to learn about their behavior and preferences. As we browse the internet, many of the websites we visit have mechanisms deployed that collect information about the user interacting with them. Cross-site tracking is not always a privacy or security concern, but it becomes a problem when a user's data is collected without their knowledge. In that case the user does not know with whom the data is shared, how it is used, where it is stored and for how long, whether they have any control over what personal information is shared, or whether they can stop sharing it.

Cross-site tracking can be problematic because it can be used to build a comprehensive and detailed profile of a person's choices, activities and behavior without their knowledge. So let's understand how cross-site tracking works.

Cross-site tracking is done primarily with cookies and scripts, but it can be done in many creative (and sometimes creepy) ways.

A short note on cookies - Cookies are small pieces of data stored on a user's device by websites they visit. They are used to remember user preferences, login sessions, and other browsing information. Cookies enable websites to provide a personalized experience and track user behavior for analytics and advertising purposes. However, they also raise privacy concerns, as they can potentially be used to track users across different websites.

Suppose that you visit an e-commerce site named ecomm.in and search for products. This website uses cookies to identify you and your preferences, primarily to provide a smooth user experience, which is considered absolutely normal. But this website also uses a third-party service, an advertising network, let's say adnet.com, to display ads on its pages. This ad network also sets some cookies in your browser to uniquely identify you.

Now you visit a news website, let's say news.org, and this website also uses the same advertising network, adnet.com, for showing ads. As soon as you visit the news website, adnet.com can identify you with the help of the cookies it set when you visited ecomm.in, and now it also knows that you have visited this news website. Such trackers can learn which links you clicked, what content was behind those links, how much time you spent on a page and many other things. As you surf more and more of the web, this builds up a tracking pattern: your preferences, choices, behavior and time spent keep getting logged with the advertising network, which can use this information to show you highly targeted ads based on your activities.

Cookies set by the website you are currently on (first-party cookies) are usually not a problem, as they are only sent back to the server that stored them on your machine. It becomes a problem when third-party cookies (cookies set by a domain other than the one you are visiting) are stored on your machine. These cookies let the third party recognise you on every site that embeds its content, so your browsing activity can be logged quietly in the background as you surf the web.

So, now that we have understood what cross-site tracking is, let's look at a few ways of preventing it.
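
To make the ecomm.in / news.org scenario concrete, here is a minimal sketch of how an ad network could link the two visits. It is a toy model and not how any real network is implemented: the AdNetwork class, the adnet_id cookie name and the Map that stands in for the browser's cookie storage are all assumptions made for illustration.

```javascript
// Toy model of an ad network embedded on several unrelated sites.
// Nothing here touches a real browser; the "cookie jar" is a plain Map.
class AdNetwork {
  constructor() {
    this.nextId = 1;
    this.profiles = new Map(); // visitor id -> list of pages seen
  }

  // Called whenever a page that embeds the network's ad is loaded.
  // `cookieJar` stands in for the browser's cookie storage for adnet.com.
  handleRequest(cookieJar, pageUrl) {
    let visitorId = cookieJar.get('adnet_id');
    if (visitorId === undefined) {
      // First contact: issue an identifier, as a Set-Cookie response would.
      visitorId = `visitor-${this.nextId++}`;
      cookieJar.set('adnet_id', visitorId);
    }
    const history = this.profiles.get(visitorId) ?? [];
    history.push(pageUrl);
    this.profiles.set(visitorId, history);
    return visitorId;
  }
}

const adnet = new AdNetwork();
const browserCookiesForAdnet = new Map(); // one browser, one jar for adnet.com

// The same jar is presented on both sites, so both visits share one id.
adnet.handleRequest(browserCookiesForAdnet, 'https://ecomm.in/search?q=shoes');
const id = adnet.handleRequest(browserCookiesForAdnet, 'https://news.org/sports');

console.log(id, adnet.profiles.get(id));
```

Because the browser presents the same adnet.com cookie on both sites, both page loads end up in a single profile. Blocking or isolating third-party cookies breaks exactly this link.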

Content Security Policy

This is a security standard used to prevent the execution of unauthorised code or scripts on our web pages. Injecting and executing such unauthorised scripts is called cross-site scripting (XSS), and CSP is designed to mitigate such attacks.

Every web application has some content which is served from the website's own server, like HTML, CSS and JavaScript files, and some which may come from a third-party source, like fonts loaded from Google Fonts. When we implement CSP in our web app, we explicitly define which sources are allowed to load content on it. This prevents loading of content from unknown sources and thus ensures that everything used in the web app comes from secure and trustworthy sources.

CSP is implemented by adding directives and specifying the source locations/addresses which are allowed to load content on our web app. This can be done through an HTTP header or a meta tag, as shown below.

Content-Security-Policy: default-src 'self'
Content-Security-Policy: default-src 'self'; img-src *; media-src example.org example.net; script-src userscripts.example.com
<html>
  <head>
    <meta http-equiv="Content-Security-Policy" content="default-src 'self'; script-src 'self'; style-src 'self' https://www.some-external-stylesheets-url.com;"/>
  </head>
</html>

In this example, we have added 3 CSP directives - default-src, script-src and style-src. When the value is 'self', content is loaded only from the same origin as the website, so here scripts can only come from the site itself. But as you can see, style-src lists one more domain along with 'self'. This allows styles to be loaded from both the original domain and the other domain.
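
When a policy grows beyond a couple of directives, it can be easier to assemble the header value in code than to maintain one long string. The following is a minimal sketch under that assumption; buildCsp is a hypothetical helper written for this article, not part of any library.

```javascript
// Build a Content-Security-Policy header value from a plain object.
// Keys are directive names, values are arrays of allowed sources.
// Keyword sources such as 'self' must keep their single quotes.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const policy = buildCsp({
  'default-src': ["'self'"],
  'script-src': ["'self'"],
  'style-src': ["'self'", 'https://www.some-external-stylesheets-url.com'],
});

console.log(`Content-Security-Policy: ${policy}`);
```

On a Node.js server the resulting string could then be sent with res.setHeader('Content-Security-Policy', policy) before the response body is written.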

You can read more about CSP here.

Same-Site Cookies Attribute

Modern websites use cookies to provide a better user experience, and sometimes cookies are also used to store important user information like session tokens and credentials in the browser. Browsers work in such a way that whenever a request is made to a website, the cookies stored for that website are attached to the request and sent to the server. Unless told otherwise, the browser also does this for requests triggered from other websites, including malicious ones. The SameSite cookie attribute can be used to control when cookies are transmitted with such requests.

It is configured on the server side when the cookie is set, and it accepts three values - Strict, Lax and None. Here's a detailed explanation of what each value does.

  1. When the SameSite attribute is set to Strict, it tells the browser not to send the cookie with cross-site requests. This means that if a request is made to a website different from the one the user is currently on, the cookie will not be included in the request. The cookie is sent only with requests that originate from the same website that set it.

  2. When the attribute is set to Lax, the cookie is withheld from most cross-site requests but allowed in certain scenarios, such as when the user follows a link to the site. This is useful when we need a balance between security and usability. With Strict, security is ensured but some usability may be lost, for example a logged-in user arriving through a link on another site would appear logged out.

  3. When SameSite is set to None, the cookie is sent with cross-site requests as well, but these requests must happen over HTTPS. We also need to specify the Secure attribute when SameSite=None.
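
The rule in the third point is easy to forget, so it can help to enforce it wherever cookies are created. Below is a minimal sketch of a function that builds a Set-Cookie header value and refuses the invalid combination. The serializeCookie helper is hypothetical; browsers reject such cookies on their own, and throwing an error here is only a design choice of this sketch.

```javascript
// Build a Set-Cookie header value with a SameSite attribute.
// Hypothetical helper: it only covers the attributes discussed here.
function serializeCookie(name, value, { sameSite = 'Lax', secure = false, httpOnly = false } = {}) {
  const allowed = ['Strict', 'Lax', 'None'];
  if (!allowed.includes(sameSite)) {
    throw new Error(`Unknown SameSite value: ${sameSite}`);
  }
  // SameSite=None is only valid together with Secure.
  if (sameSite === 'None' && !secure) {
    throw new Error('SameSite=None requires the Secure attribute');
  }
  const parts = [`${name}=${encodeURIComponent(value)}`, `SameSite=${sameSite}`];
  if (secure) parts.push('Secure');
  if (httpOnly) parts.push('HttpOnly');
  return parts.join('; ');
}

console.log(serializeCookie('session', 'abc123', { sameSite: 'None', secure: true }));
console.log(serializeCookie('theme', 'dark'));
```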

Consider this example -

Let's say rough.com has earlier set a cookie in the user's browser with SameSite set to Lax, and the user now clicks a link on example.com that navigates to rough.com. This is a top-level navigation, so the rough.com cookie will be sent to rough.com along with the request.

In another scenario, a script is loaded from rough.com on a page of example.com. In this case, the cookie will not be sent to rough.com when SameSite is set to Lax.

Now let's say the user submits a form on example.com which makes a POST request to rough.com. Again, the cookie will not be sent to rough.com when SameSite is set to Lax, because only safe methods like GET are allowed.
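
The three scenarios can be written down as a small decision function. This is a simplified sketch of the check a browser performs before attaching a cookie to a request. The shouldSendCookie function and its parameter names are made up for illustration, and real browsers handle further edge cases that are left out here.

```javascript
// Simplified model of the browser's SameSite check for one cookie.
// crossSite: the request is triggered from a different site than the cookie's.
// topLevelNavigation: the request changes the URL shown in the address bar.
function shouldSendCookie({ sameSite, crossSite, topLevelNavigation, method }) {
  if (!crossSite) return true;           // same-site requests always carry the cookie
  if (sameSite === 'None') return true;  // allowed everywhere (the cookie must be Secure)
  if (sameSite === 'Strict') return false;
  // Lax: only top-level navigations that use a safe HTTP method.
  const safeMethods = ['GET', 'HEAD', 'OPTIONS', 'TRACE'];
  return topLevelNavigation && safeMethods.includes(method);
}

// A rough.com cookie with SameSite=Lax, requests triggered from example.com:
const linkClick = { sameSite: 'Lax', crossSite: true, topLevelNavigation: true, method: 'GET' };
const scriptLoad = { sameSite: 'Lax', crossSite: true, topLevelNavigation: false, method: 'GET' };
const formPost = { sameSite: 'Lax', crossSite: true, topLevelNavigation: true, method: 'POST' };

console.log(shouldSendCookie(linkClick));  // true
console.log(shouldSendCookie(scriptLoad)); // false
console.log(shouldSendCookie(formPost));   // false
```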

Here's a code snippet of a very simple backend server built with Node.js and Express which shows the use of the SameSite cookie attribute -

const express = require('express');
const cookieParser = require('cookie-parser');

const app = express();
app.use(cookieParser());

app.get('/', (req, res) => {
    // Set a cookie with the SameSite attribute
    res.cookie('myCookie', 'value', {
        sameSite: 'lax', // or 'strict' or 'none' ('none' requires secure: true)
        secure: true, // send the cookie only over HTTPS
        httpOnly: true // hide the cookie from client-side JavaScript (document.cookie)
    });

    res.send('Cookie set with SameSite attribute!');
});

app.listen(3000, () => {
    console.log('Server is running on port 3000');
});

Now, when you hit this API endpoint from a browser, you can find a cookie named 'myCookie' in the Application tab of your browser dev tools. It will be present in the Cookies section and its value will be set to 'value'.

Browser fingerprinting

This refers to a set of tracking techniques which websites can use to collect information about you as you browse the internet. Many websites deploy techniques that collect information about your device, like your browser details, default language, timezone, operating system and many other things. These pieces of information are then stitched together to make a unique "fingerprint" that can be traced back to you.

So much information can be collected that it may be enough to single out a specific user from millions of users on the internet with very high accuracy (figures in the range of 90-99% are often quoted). This information is collected primarily using scripts that execute silently in your browser and from HTTP headers like User-Agent. As ordinary users, we have no easy way to distinguish such scripts from other essential scripts or to control what value is being sent through this header.

Fingerprinting is more invasive than cookie-based tracking. With cookies, we can control which cookies servers are allowed to store in our browser, and we can delete them as well, but when it comes to fingerprinting we have very little control.

One way to mitigate fingerprinting is to reduce the number of add-ons, external themes and plugins that we use in our browsers. This exposes less information about our browsers and hence about our preferences. Another is to use Firefox for surfing, as it uses some advanced techniques to reduce the data shared through fingerprinting. Other techniques are generalization and randomization. Certain tools use these techniques to make your fingerprint less specific by "masking" your data, either replacing it with generic values or making it more random.
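
The "stitching together" step can be sketched in a few lines. The example below assumes the attributes have already been collected into a plain object (in a browser they would come from places like navigator.userAgent and navigator.language) and reduces them to one short identifier. The attribute names and the choice of a 32-bit FNV-1a hash are illustrative assumptions; real fingerprinting scripts use far more signals.

```javascript
// 32-bit FNV-1a hash over the string's UTF-16 code units, as 8 hex digits.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Combine attributes in a fixed (sorted) key order so that the same
// device always produces the same fingerprint.
function fingerprint(attributes) {
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => `${key}=${attributes[key]}`)
    .join('|');
  return fnv1a(canonical);
}

// Hypothetical attribute values for one visitor.
const visitor = {
  userAgent: 'Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0',
  language: 'en-IN',
  timezone: 'Asia/Kolkata',
  platform: 'Linux x86_64',
  cores: 8,
};

console.log(fingerprint(visitor));
console.log(fingerprint({ ...visitor, cores: 4 })); // one changed attribute gives a different id
```

No cookie is stored anywhere in this sketch: the identifier is recomputed from the device's own characteristics on every visit, which is why clearing cookies does not help against it.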
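
To illustrate the idea of generalization, here is a minimal sketch that coarsens a few attributes so that different devices report the same values. The generalize function, the attribute names and the bucket size of 200 pixels are assumptions chosen for this example and do not describe how any particular browser or tool works.

```javascript
// Replace precise values with coarse ones so that many devices look alike.
function generalize(profile) {
  return {
    language: profile.language.split('-')[0],                  // 'en-IN' -> 'en'
    timezone: 'UTC',                                           // report one generic timezone
    screenWidth: Math.floor(profile.screenWidth / 200) * 200,  // round down to a bucket
    screenHeight: Math.floor(profile.screenHeight / 200) * 200,
  };
}

// Two hypothetical users with clearly different raw profiles.
const alice = { language: 'en-IN', timezone: 'Asia/Kolkata', screenWidth: 1920, screenHeight: 1080 };
const bob = { language: 'en-US', timezone: 'America/New_York', screenWidth: 1850, screenHeight: 1100 };

console.log(generalize(alice));
console.log(generalize(bob));
```

Both users end up with the same generalized profile, so a fingerprint computed from it can no longer tell them apart. Randomization takes the opposite route and adds noise so that the same device does not look the same twice.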
