Design a texting app.
Kind of like Snapchat. But with text messages, not pictures.
First step: figure out what's in scope for our app.
What questions should we ask?
Don't worry if you've never built a mobile app before—we'll just be talking about the app's overall design. No iOS or Android specifics.
What are we building? A web API? Website? Mobile app? Focus on the mobile app portion for now.
What phones do we support? Android? iOS? Don't focus on specific phones yet; that's a bit more detail than we want at this stage.
Who can we chat with? Assume you'll be chatting with other Tapchat users. Each user has a unique cell phone number.
Is there a length limit for texts sent over Tapchat? Yes: 280 characters. Just like tweets.
What features do we need to support? Let's focus on the basics: sending and receiving messages. If we have time, we'll build up more functionality in our app at the end.
How many people do we want to support? We'll come up with a specific estimate in a bit. Think big—there are seven billion phones in the world.
Alright, this should be enough to get us started. We're building a texting app where users can exchange messages with each other.
Next up: let's crunch some numbers to figure out how big our system will need to be.
How many users do we have? And how many chats does each person send per day?
All of our users have to have mobile phones (since we're building a mobile app). And there are roughly 7 billion phones in the world.
Of course, not all mobile users will use our app. Like Snapchat, we'll probably be most popular with people in their teens and 20's. If that's roughly a sixth of the total mobile population, then we're looking at a little over 1 billion target users.
We'd love to have all of those people use our app. But we also want to be realistic when building out our system.
Facebook has spread widely around the globe, and with over 2 billion users, they're reaching about a third of the population. So, let's say we'll get a third of our target audience. That puts us around 400 million users.
As a reference point, Snapchat has around 160 million users. So we're definitely in the ballpark for how many users we could have when we go viral.
This is kind of tricky to estimate. Some users will probably send chats constantly, while others might only open the app every few days.
Let's say the most any user would send is 100 chats per day. That means we should be fine if our system can handle 40 billion chats per day in total. Call it 50 billion just to be on the safe side.
If our users send messages at a consistent rate throughout the day, that's about 2 billion chats per hour, 35 million chats per minute, or roughly 600,000 chats per second.
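Spelling that arithmetic out as a quick sanity check:

```python
chats_per_day = 50_000_000_000                     # our padded daily estimate

print(f"{chats_per_day / 24:,.0f}")                # ~2.1 billion chats per hour
print(f"{chats_per_day / (24 * 60):,.0f}")         # ~35 million chats per minute
print(f"{chats_per_day / (24 * 60 * 60):,.0f}")    # ~580,000 chats per second
```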
Will users really send messages at a consistent rate?
It's tempting to say yes. After all, with a global user base, any variation by time (e.g.: more chats sent in the evenings than overnight) should sort of balance out.
But what about events that cause a spike in the number of chats sent? We'll probably see lots of chats on holidays like Christmas. Or, if there's a natural disaster.
Since we're basing our numbers on the assumption that every user sends 100 chats per day (a high estimate), we have a good amount of extra capacity built in. That should cover any spikes we might see.
So, we'll keep working with our assumption that the most we'll ever see is roughly 600,000 chats per second. Most of the time we'll see fewer, and we really shouldn't ever see more, even if users start sending a lot of messages at once.
Do we even need storage? Can't we just deliver each message to its recipient immediately?
Not if their phone is turned off!
So, when a user sends a message, we'll need to store it briefly—just until we can deliver it to the recipient. In most cases, that shouldn't take long. But in rare cases (say, a user loses their phone and has to save up for a new one), it might be a while before we can deliver the message. To keep old messages from cluttering up our database, we'll delete any undelivered messages after they've been there for a month.
Okay, so how many chats will we have to store in our database?
As a rough upper limit, let's say that we have to be able to store about 50 billion chats—roughly the number of chats we'd get every day.
Earlier, we decided that each chat would be 280 characters. So, that's 280 bytes, plus some extra information (like its sender). As an estimate, let's say 300 bytes per chat.
Is 280 characters always 280 bytes?
It depends how we're storing our characters. If we only need to support English, then we can use an encoding like ASCII, where one character takes up one byte. But if we want to support other languages or emojis, we'll need a different encoding, like UTF-8. Then, 280 characters could take up more than 280 bytes.
For now, we'll stick with ASCII, since it'll make our calculations a bit simpler.
If we're storing 50 billion chats, at 300 bytes per chat, we're looking at about 15 terabytes of data.
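The same kind of quick check works for storage:

```python
chats_stored = 50_000_000_000      # about a day's worth of undelivered chats
bytes_per_chat = 300               # 280 ASCII characters plus sender/recipient metadata

total_bytes = chats_stored * bytes_per_chat
print(total_bytes / 10**12)        # 15.0 -> roughly 15 terabytes
```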
Now that we've got some numbers, let's come up with a minimum viable product that'll handle a small number of users. Then, we'll scale that up to our global audience of texters.
Okay, so to get started, what's in our minimum viable product?
The main building blocks of Tapchat are the user's phone, servers for handling requests (to either send a new message or request received messages), and some storage for tracking chats that haven't been delivered yet.
Drawing out all these components on a whiteboard, here's what we've got:
This isn't going to scale too well, but it gets the big pieces of our system on the drawing board and gives us something to work with.
Let's keep fleshing this out. What's our data model?
We're mainly keeping track of one thing: chats from one user to another.
Each chat has a sender, a recipient, and contents.
How should we store users, like the sender and recipient, in our database?
There are a few different ways we could do it.
We could have our users pick a unique user ID when they sign up or assign them one automatically. Do we need that though?
Since each user already has a distinct phone number, we could use that to identify them. International phone numbers are at most 15 digits, so stored as strings, that's 15 characters—15 bytes—per user.
We don't have to use strings, though. We could convert each phone number string into an actual number and store that. We'd need to handle values up to 10^15, which fits comfortably in 64 bits (8 bytes).
We'll go with that, since it's the most space efficient.
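Here's a minimal sketch of that conversion, assuming phone numbers arrive with ordinary formatting (digits plus separators):

```python
def phone_to_id(phone: str) -> int:
    """Turn a phone number string into a compact integer ID."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    number = int(digits)
    assert number < 2**64       # at most 15 digits, so it fits easily in 64 bits
    return number

print(phone_to_id("+1 415-555-0123"))   # 14155550123
```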
So, here's what our database looks like:
Chats

| Field | Size |
| --- | --- |
| Sender (phone number) | 8 bytes |
| Recipient (phone number) | 8 bytes |
| Message contents (280 ASCII characters) | 280 bytes |
| **Total** | **296 bytes** |
Now that we've got a basic data model in place, let's run through the steps involved for sending and receiving messages.
Say one user, Klarissa, wants to send a chat to Ajit. What does that look like under the hood?
Klarissa types out her message in the Tapchat mobile app. Once she's finished, she'll tap "Send."
This sends Klarissa's message up to our app servers.
If we can deliver the message to Ajit, we'll do that. Otherwise, we'll add Klarissa's message to our database.
At some point later on, Ajit will open up the Tapchat app, which will connect to our app servers and request any new messages.
When we query the database for messages pending for Ajit, we'll find Klarissa's message, send it to Ajit, and delete it from our database.
What if Ajit turns off his phone after we send the message but before the message gets processed by the app? If we're not careful, we might lose the message.
To prevent this, we shouldn't delete a message from our database until we confirm the message has been successfully delivered to the user.
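Here's a rough sketch of that delivery rule, with a plain in-memory dictionary standing in for our database:

```python
# Pending chats, keyed by recipient phone number. A real system would use a
# proper datastore; this just illustrates the "delete only after an ack" rule.
pending_messages = {14155550123: ["Hey!", "You around?"]}

def send_to_device(recipient: int, message: str) -> bool:
    """Pretend to deliver a message; return True only if the device acknowledges it."""
    print(f"delivering to {recipient}: {message}")
    return True

def deliver_pending(recipient: int) -> None:
    for message in list(pending_messages.get(recipient, [])):
        if send_to_device(recipient, message):
            pending_messages[recipient].remove(message)   # delete only after the ack
        # If delivery isn't confirmed, the message stays put for a later attempt.

deliver_pending(14155550123)
```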
Great, looks like we've got the basic logic in place here.
Interviewer: "Let's dig into communication between the app servers and the recipient a bit more. How does the server get messages to the recipient?"
Since we're working with mobile phones, we can send push notifications.
How do push notifications work under the hood?
Each device manufacturer maintains a push notification service. Apple's is the Apple Push Notification Service (APNS). Google's is called Firebase Cloud Messaging. When a phone starts up, it sets up a secure connection with its appropriate Push Notification Service.
Each app that is allowed to send push notifications must register with the phone's operating system (iOS or Android). The phone OS requests a device-specific token from the Push Notification Service and gives this token to the app.
Application servers (e.g.: our Tapchat server) can use this token to send notifications to the Push Notification Service, which forwards the data to the user's device on behalf of the app.
So, when our servers get a message to be delivered to another user, they can turn around and send that message to the recipient through a push notification.
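In code, that hand-off might look roughly like this. The endpoint, credentials, and payload are simplified placeholders—each real push service (APNS, FCM) has its own API:

```python
import requests

def send_push(device_token: str, sender: int, text: str) -> bool:
    """Hand a chat to the push notification service for delivery.

    The URL, auth header, and payload here are stand-ins; the real services
    have their own endpoints, credentials, and message formats.
    """
    response = requests.post(
        "https://push.example.com/v1/send",                  # placeholder endpoint
        json={"token": device_token, "data": {"from": sender, "text": text}},
        headers={"Authorization": "Bearer <our-server-key>"},
        timeout=5,
    )
    return response.ok    # anything else means "fall back to letting the user pull"
```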
Of course, we'll need some way to handle cases where delivery of the push notification fails for some reason. (Maybe the user disabled push notifications. Or the push notification service is down. Or something else went wrong.)
In those cases, we'll need to wait for the user to manually request messages from us. We could build this into our app's user interface (UI), adding a button for requesting new messages or mimicking other apps by having users swipe down to fetch new messages.
We could also periodically check for messages in the background, assuming our app has the right permissions. But we need to be careful—if the app checks too often, it'll drain the battery!
Putting those together, here's what we have:
Interviewer: Okay, nice. Just brainstorming for a bit, how do you think the Push Notification Service delivers messages to the user's phone?
One way the communication could work is with polling. Periodically, the phone would reach out to the Push Notification Service and request any new notifications.
One issue: polling doesn't scale! Apple has hundreds of millions of iPhones, and they'd all be trying to connect to the Push Notification Service every minute. Yikes—that would take a lot of servers and network bandwidth.
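A quick back-of-envelope calculation shows why (the device count is a rough assumption):

```python
iphones = 1_000_000_000            # rough order of magnitude of active devices
poll_interval_seconds = 60         # every phone checks in once a minute

print(f"{iphones / poll_interval_seconds:,.0f} requests per second")
# ~16,666,667 requests per second—most of them returning nothing new
```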
Other connection types, like long polling or server-sent events, are a bit more complex to set up and administer, but they generate much less network traffic and scale more gracefully.
It's hard to know exactly what's happening under the hood of the Push Notification Service, but it probably looks a lot like server-sent events.
Interviewer: "Okay, great. Let's get back on track and move on to scaling up the system."
Right. We want to handle 50 billion chats per day. What changes should we make to scale up to this load?
With billions of chats per day, we're going to need a lot more than one web server.
We'll go with a tried-and-true technique here: an autoscaler (like Amazon's EC2 Auto Scaling) that spins up new servers on the fly as they're needed. To distribute load between them, we'll add in a load balancer.
Users (or rather, their apps) interact with our servers in two ways:

- Sending a new chat to another user.
- Requesting any chats that are waiting to be delivered to them.
We could design our system with two sets of servers: one set for handling incoming chats and another set for delivering chats to users. That might seem like a good idea, since it'll cleanly distinguish between these two actions.
Alternatively, we could just have one big group of servers, and we'd make each server capable of handling both actions.
Both options are fine. Let's go with the second option here. Being able to treat each server as interchangeable will make our system easier to maintain and administer, avoiding two separate sets of infrastructure.
Let's shift gears and focus on our database. How can we make it handle billions of messages per day?
Does every message make it to our database?
Maybe not. If we can quickly deliver a message using push notifications, we might not need to store it at all.
But we'll definitely need to store messages that we can't deliver via push notifications.
Just to be on the safe side, we'll make sure our system can handle the case where no messages are delivered quickly and all of them end up being written to the database.
With 50 billion chats per day, we're writing and reading about 15 terabytes of data every 24 hours. That's a lot of data, reads, and writes—probably more than a single machine could handle.
Once we break our database across multiple machines, our database management becomes much more complicated. Traditional SQL databases offer nice guarantees about data consistency (everyone sees the same data in the database) and are flexible to query. But they're trickier to scale across multiple machines. NoSQL databases scale well in a distributed system, but offer fewer features and guarantees.
Which type of database should we use?
The most important features of our database are scalability (since we want to be able to handle millions of phones) and speed (since we want our app to be reasonably fast). And, we don't need strong consistency guarantees or database transactions.
Given these priorities, a NoSQL database seems like a better choice.
Interviewer: "Can you talk a bit about the different types of NoSQL databases? Any ideas on what type you'd use here?"
There are four main types of NoSQL databases:

- Key-value stores, which map a single key to a single value.
- Document databases, which store semi-structured documents (think JSON blobs) under a key.
- Column databases (wide-column stores), which group data into rows with flexible columns, often sorted by key.
- Graph databases, which store nodes and the edges connecting them.
"There are so many NoSQL databases! How do I know which one to use?"
Even if you're not familiar with the specifics of a particular NoSQL database, you should know the different types of NoSQL databases (i.e.: the ones we listed above) and pick one to use.
There might be more than one option that makes sense—in some cases you can use a column database or a document database. Pick whichever one seems to best fit the problem.
We want a NoSQL database that pairs a user's phone number with a list of messages to deliver to them.
All of these types could work. Let's go with a document database here, since it cleanly corresponds to our model where we're storing a list of chats for each user, keyed by their phone number.
Our database entries might look something like this (the field names are just illustrative):
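```python
# One document per recipient, keyed by their phone number (illustrative sketch).
entry = {
    "_id": 14155550123,                               # recipient's phone number
    "pending_messages": [
        {"sender": 14155550199, "text": "Hey, you up?"},
        {"sender": 14155550142, "text": "Lunch tomorrow?"},
    ],
}
```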
What if we chose graph databases instead? That could work!
If you have experience working with NoSQL graph databases, you could suggest using a graph database like Neo4J. Each user could be a node in the database, and we'd add edges to represent messages from one user to another. When we receive a message, we'd add a new edge to our graph. When asked to deliver messages, we'd iterate through all the edges adjacent to the recipient's node.
Interviewer: "Okay, good. Taking a step back to our earlier discussion on SQL and NoSQL, what are common challenges with NoSQL, and how might they complicate our app's design?"
Broadly, NoSQL databases take away three "nice" features that SQL databases have:

- Database transactions
- Flexible queries (joins and other complex lookups)
- Strong consistency (every read reflecting the latest write)
Which of these could complicate the design of our app?
Let's run through that list with our app in mind.
We don't need database transactions, so that's not a problem.
We're not doing any fancy lookups—in fact, our data model is really quite simple. So that's OK too.
We do make updates to our databases though, so we need to figure out how we should handle concurrent updates and if we're OK working with data that might not be completely up to date.
What are the specific scenarios in Tapchat that we need to consider?
Here are two big ones:

- Two people send chats to the same user at about the same time. We can't afford to drop either message.
- A user requests their messages twice in quick succession. We shouldn't deliver the same message to them twice.
Let's start with the first one. What do we need to do to make sure we don't drop any messages?
If a user gets a message from two people at the same time, then it's possible that our database will have two distinct values (messages) associated with one key (the recipient's phone number).
That's an inconsistency because looking up the one key can produce different results depending on which database server ends up handling the request.
Here, we'd want to make sure that our database handles this scenario by combining all of the pending messages for a user and setting that as the value for that user's entry.
Of course, the specifics of how we'd implement this will depend on what database we're specifically using. Something to keep in mind when we're choosing between NoSQL options.
Some column databases, like Bigtable or HBase, store their data sorted by key. These databases allow you to efficiently look up all entries whose keys begin with a certain value.
If we had chosen one of these databases for Tapchat, then we could assign each message a unique key, sidestepping the issue of concurrent key updates entirely.
We might construct keys by concatenating the recipient, sender, and a timestamp together. Since these databases are sorted by key, as long as we start our key with the recipient's phone number, all messages sent to the recipient will appear next to each other in our database. That makes it easy to find all the messages that we need to deliver when they're requested.
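For example, here's a sketch of that key scheme (the zero-padding just keeps the string keys sortable, and 15 digits covers any phone number):

```python
import time

def make_chat_key(recipient: int, sender: int) -> str:
    """Build a sortable key: recipient first, so all of one user's pending
    chats sit next to each other in a key-sorted store."""
    return f"{recipient:015d}:{sender:015d}:{time.time():.3f}"

print(make_chat_key(14155550123, 14155550199))
# e.g. "000014155550123:000014155550199:1488645600.000"
```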
What about the second scenario? Where a user's request for messages sends back old messages that have already been delivered?
How might this scenario occur? Let's walk through the steps:
Suppose Sofia checks for new messages—sending a request to our app servers and getting back a chat from Min:
Once the message has been delivered, we can delete it from the database. But, while the delete is being processed, Sofia issues another request for new messages.
This request goes to a different server—one that hasn't processed the delete yet. This server delivers the same message from Min to Sofia:
Whoops. Sofia just got Min's message twice.
Where should we try to fix this? In our databases? App servers? Or on the mobile devices?
It might be tempting to try to tweak our databases to prevent this scenario from happening. After all, the whole problem is that our database is sending us back stale data.
Hang on a second though. We're using a NoSQL database, which means that we might not be guaranteed to always read up-to-date data. So, unless we are sure to use a database that does have those guarantees, we can't prevent stale reads in our database.
What about in our app servers?
That's tempting. We could give each message a unique identifier and have our app servers track which identifiers they've already sent down to mobile devices.
Why do we need a separate identifier? Couldn't we just use a hash of the message content?
Hashing the message would only work if each message were unique. If someone sent the same message twice, it would have the same hash, and we'd erroneously only deliver one copy!
Imagine someone sending "Wake up!" and then "Wake up!" again—both copies would hash to the same value.
How do messages get their unique identifiers? Given all the difficulty we have syncing up information between our database and app servers, it'll be hard to ensure that the IDs are truly unique.
One common solution to this issue is to just randomly generate a 128-bit number and use that as the ID. Since there are so many possibilities, it's exceedingly unlikely that the same number will be generated twice.
To be thorough, we should make sure nothing bad happens if we do have the same ID generated twice. In this case, we might end up erroneously deleting the message with the duplicate ID. If we're worried, we could combine the random ID with other message metadata, like the sender and a timestamp, to make it even less likely we'd have two messages that might be mistaken as duplicates.
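A sketch of that combined ID scheme (uuid4 is just a standard way to get 128 random bits):

```python
import time
import uuid

def make_message_id(sender: int) -> str:
    """A random 128-bit ID, plus sender and timestamp to make it even less
    likely that two different messages ever look like duplicates."""
    return f"{uuid.uuid4().hex}:{sender}:{int(time.time())}"

print(make_message_id(14155550199))
# e.g. "3f1c9b2a5e8d4c7fa0b1d2e3f4a5b6c7:14155550199:1488645600"
```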
One issue with that logic though. We have multiple app servers, and there's no guarantee that requests from one device will always go to the same app server. We'd need some way for all of the app servers to agree on which chats have already been sent out ... but that could be tricky to scale.
Another valid solution would be to use load balancers with sticky sessions. Those load balancers will send traffic from the same device to the same app server every time, which fixes the issue we had above where a device requested data from two different app servers.
But there's a tradeoff: we're giving up some flexibility with load balancing. If one server is under a heavy load and another is free, we can't send traffic from any of our users with existing sessions on the busy server over to the free one. All we can do is direct traffic from new users to the free server.
The logic of associating each message with a unique ID and using that to filter out duplicates read from our database seems promising. Can we implement it in our mobile app so that it runs on the user's phone?
That should work! We'll have our app track the IDs of messages it's delivered recently. And, if it gets a new message with an ID it's seen before, it'll treat it as a duplicate and won't pass it up to the user.
Let's run through our example again and make sure this fixes it.
Like last time, Sofia requests any new chats from our app servers. We send back a message from Min, this time tagged with an ID of 101:
What happens if Sofia requests her messages again, before the message from Min is completely deleted from our database?
Our database could send the message again, and it'll arrive at Sofia's phone. But this time, the app sees that message ID 101 has already been delivered, so it discards the duplicate.
This way, Sofia only sees the message once. Nice!
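That check is small enough to sketch out here; a real app would also persist the seen IDs so they survive an app restart:

```python
# Runs on the phone: remember recently delivered message IDs and drop repeats.
delivered_ids: set = set()

def show_to_user(text: str) -> None:
    print(f"New chat: {text}")

def handle_incoming(message_id: str, text: str) -> None:
    if message_id in delivered_ids:
        return                      # duplicate—already shown to the user
    delivered_ids.add(message_id)
    show_to_user(text)

handle_incoming("101", "Hey Sofia!")   # shown
handle_incoming("101", "Hey Sofia!")   # silently dropped
```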
Here's another approach to the duplicate message problem. Some document databases, like CouchDB, associate data with a revision number that gets updated each time the data changes.
If we were using CouchDB, we could have each phone track the latest revision it's seen so far and include that revision in its requests to the app servers. If our app server does a lookup and gets back a revision that's older than the one sent by the user, those messages have already been delivered and can be deleted.
Interviewer: "We're almost out of time. Just to make sure we're on the same page on what you've built, walk me through the steps of sending and receiving messages in Tapchat."
Here's a final picture of our system's architecture:
Here's what happens on a send:

1. The sender types out a message in the Tapchat app and taps "Send."
2. The app sends the message up to our app servers, through the load balancer.
3. If we can deliver the message right away with a push notification, we do.
4. Otherwise, we store the message in our NoSQL database, keyed by the recipient's phone number.
There could be other users sending chats to the same person. That's OK—our NoSQL database can reconcile concurrent updates.
And, here's what happens when a user requests any new messages manually:

1. The app sends a request up to our app servers, again through the load balancer.
2. A server looks up the user's pending messages in our database and sends them down to the phone, each tagged with its unique ID.
3. Once delivery is confirmed, the server deletes those messages from the database.
4. The app checks each incoming ID against the IDs it's seen recently and only shows new messages to the user.
If the same user immediately requests their messages again, and their request goes to a different server, it's possible our app servers will send back the same message twice. But, thanks to our ID checks, it'll only make it up to the user once.
Having fun? There are plenty of additional pieces of the system left to think about.