Loginless – A New Standard for User Identification

January 12, 2020

We minimize the data we collect about our users. Originally, this was because we didn't feel technically capable of protecting their data.[1] Today, we do it because it reduces friction and because we believe it's the right thing to do. That said, we also believe that account systems offer a lot of conveniences. The loginless system we've developed is an attempt to combine these two ideals into a persistent user identity without requiring any personally identifiable information (PII) such as an email, phone number, or password.

An account system needs two things to be complete: restorability and syncability.[2] Restorability refers to the ability to restore an identity that was previously tied to a device (think re-downloading an app on your phone and signing in again). Syncability refers to the ability to sync an identity to a new device from an existing one (think signing into your email account on your new phone). We describe the process of assigning, restoring, and syncing a user identity and how this process differs depending on device type.

Assigning

The first time a user visits the Coursicle web app or downloads a mobile app, we generate a random UUID server-side and set the UUID in the user's cookies (browser), iCloud key-value store (iOS), or Backup Service (Android). On the web app, we also fingerprint[3] the user's browser,[4] hash the fingerprint, and then send this hash to our server along with any application-layer data that may help us re-identify the user later, such as which college they go to. We store their UUID and this hash together in our database,[5] and on each new page they visit, we update the hash if it has changed, so that we always have the freshest hash for them.
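A minimal sketch of the assignment step may make the flow concrete. It runs in a single process for illustration, whereas in the flow described above the hashing happens in the browser and only the hash is sent to the server. The choice of SHA-256, the in-memory `db` dictionary, and every name here (`IdentityRecord`, `hash_fingerprint`, `assign_identity`, `refresh_fingerprint`) are assumptions made for the example, not details taken from the actual system.

```python
import hashlib
import json
import uuid
from dataclasses import dataclass


@dataclass
class IdentityRecord:
    """Hypothetical row shape: one UUID with its latest fingerprint hash."""
    user_uuid: str
    fingerprint_hash: str
    college: str


def hash_fingerprint(attributes: dict) -> str:
    """Hash browser attributes into a hex digest.

    Keys are sorted so the same attributes always yield the same hash.
    SHA-256 is an assumption; the post does not name a hash function.
    """
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assign_identity(db: dict, attributes: dict, college: str) -> IdentityRecord:
    """Create a random UUID for a first-time visitor and store it with the hash."""
    record = IdentityRecord(
        user_uuid=str(uuid.uuid4()),
        fingerprint_hash=hash_fingerprint(attributes),
        college=college,
    )
    db[record.user_uuid] = record
    return record


def refresh_fingerprint(db: dict, user_uuid: str, attributes: dict) -> bool:
    """On a later page visit, update the stored hash; return True if it changed."""
    record = db[user_uuid]
    new_hash = hash_fingerprint(attributes)
    if new_hash == record.fingerprint_hash:
        return False
    record.fingerprint_hash = new_hash
    return True
```

Storing only the hash, rather than the raw attributes, keeps the stored record small and avoids retaining the individual browser characteristics.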

In addition to this UUID, we assign the user a picture selected randomly from a set of 350 hand-picked public domain images. Because users become accustomed to seeing the same image each time they use Coursicle, they can tell at a glance whether their device has become disassociated from their UUID, without having to memorize the UUID itself. We also display the last 6 characters of their UUID (which we refer to as a User ID) and a link to both a summary of the loginless system and opt-out options.
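The two display values described above are simple to derive. The sketch below assumes the pictures are addressed by an index from 0 to 349 and that the UUID is held in its usual 36-character string form; `PICTURE_COUNT`, `pick_picture_index`, and `display_user_id` are hypothetical names.

```python
import secrets

PICTURE_COUNT = 350  # size of the image set described above


def pick_picture_index() -> int:
    """Choose one picture uniformly at random (index 0..349)."""
    return secrets.randbelow(PICTURE_COUNT)


def display_user_id(user_uuid: str) -> str:
    """Return the last 6 characters of the UUID, shown to the user as the User ID."""
    return user_uuid[-6:]


# Example with a fixed, made-up UUID:
print(display_user_id("123e4567-e89b-12d3-a456-426614174000"))  # prints 174000
```

Six hexadecimal characters give about 16.7 million possible values, so the User ID is a convenience label for people rather than a unique key.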

Syncing and Restoring

When a user on iOS or Android installs the Coursicle app on a new device, or re-installs it on a device, iCloud key-value store or Backup Service automatically restores their UUID to the device (provided they are using the same iCloud or Google Play account).

Any time a user visits the Coursicle web app and does not have a UUID already set in their browser, we silently attempt to re-identify them using their browser's fingerprint and any relevant application-layer data (such as the college of the page they're visiting). If this silent re-identification fails and they actually are a returning user and want their data restored, we provide them with several options depending on whether they have a second device that is still associated with their UUID.
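As a rough illustration, the silent lookup could be sketched like this. It assumes the stored records are plain dictionaries with `uuid`, `fingerprint_hash`, and `college` keys, and that a match is accepted only when exactly one UUID fits; both the record shape and the function name `silent_reidentify` are hypothetical, and the real matching rules may well differ.

```python
from typing import Optional


def silent_reidentify(records: list[dict], fingerprint_hash: str,
                      college: str) -> Optional[str]:
    """Return a UUID only if exactly one stored UUID matches hash and college.

    An ambiguous match (two or more UUIDs) is treated the same as no match,
    so the caller falls back to the manual restore options.
    """
    matches = {
        r["uuid"]
        for r in records
        if r["fingerprint_hash"] == fingerprint_hash and r["college"] == college
    }
    if len(matches) == 1:
        return next(iter(matches))
    return None


records = [
    {"uuid": "u1", "fingerprint_hash": "h1", "college": "unc"},
    {"uuid": "u2", "fingerprint_hash": "h1", "college": "duke"},
]
print(silent_reidentify(records, "h1", "unc"))  # prints u1
```

Here the college narrows the candidate set: the two records share a fingerprint hash, but only one belongs to the college of the page being visited.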

If they have a second device already associated with their UUID, they have 2 options:

If they do not have a second device already associated with their UUID (or do not presently have access to it), they have 3 options:

It's important to note the flexibility of the loginless system. Depending on where on the convenience–security spectrum developers would like their application to sit, they can curtail the options above to a select few. In fact, some applications, such as Lyft and WhatsApp, already use a modified version of Option D as their only method of login.[7] We think this is a step in the right direction and we're interested to see how we can take it further with loginless.

Conclusion

Loginless is an experiment we've been eager to try. That said, even if it proves successful at Coursicle, it still may not be widely applicable. This is because our user base is unusual: the lifespan of a Coursicle user is limited to the ~4 years they're in college, most college students have at most two devices they use regularly, students are nicely segmented by college (helpful for re-identification), and the information students keep on their Coursicle account, like which classes they're taking, is not a closely guarded secret.

6/9/20 update: it has become clear that browser fingerprint collisions are much more frequent than we expected. Since implementing the loginless system, 1.74 million browsers have visited Coursicle. 159,000 of these browsers were ineligible for the identity restore process because we detected that two or more distinct users had browsers with the same fingerprint. That means that only 90.8% of browsers using Coursicle were eligible for automatic restoring of data based on their fingerprint.[8] We still believe that fingerprinting has incredibly high (≥99.9%) accuracy for unique device identification in certain use cases (such as uniquely identifying devices on a site with modest traffic in a 24-hour period), but we've found that when you remove time windows and expand to millions of devices across the internet, this accuracy plummets.
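The role of the time window can be illustrated with a small check. Note 5 mentions, as a possibility, treating a fingerprint as "too common" only when two different users produce it within some period of each other. The sketch below implements that idea under stated assumptions: the 30-day default is the illustrative figure from note 5, sightings are `(uuid, last_seen)` pairs, and the name `is_too_common` is hypothetical.

```python
from datetime import datetime, timedelta


def is_too_common(sightings: list[tuple[str, datetime]],
                  window: timedelta = timedelta(days=30)) -> bool:
    """True if two different UUIDs produced this hash within `window` of each other."""
    ordered = sorted(sightings, key=lambda s: s[1])
    for i, (uuid_a, seen_a) in enumerate(ordered):
        for uuid_b, seen_b in ordered[i + 1:]:
            if seen_b - seen_a > window:
                break  # later sightings are even further away in time
            if uuid_a != uuid_b:
                return True
    return False


start = datetime(2020, 1, 1)
close_pair = [("u1", start), ("u2", start + timedelta(days=10))]
far_pair = [("u1", start), ("u2", start + timedelta(days=400))]
print(is_too_common(close_pair))  # prints True
print(is_too_common(far_pair))    # prints False
```

Passing a very large `window` reproduces the strict rule, where any collision at all makes the fingerprint ineligible for automatic restore.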

Among the 1.58 million browsers that the loginless system claimed it could uniquely identify, it ended up restoring users' data 154,000 times on 118,000 different browsers.[9] Given the vast number of restores occurring, we've decided to make further modifications to the restore flow of the loginless system. Specifically, after we successfully re-identify a user by their browser's fingerprint, the user must now take action to confirm that the identity we're trying to restore to their browser is correct.
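The figures in this update can be cross-checked with a few lines of arithmetic. The inputs below are the rounded counts quoted in the text, so the results agree with the stated percentages only up to rounding; the variable names are illustrative.

```python
total_browsers = 1_740_000     # browsers seen since launch
collided_browsers = 159_000    # ineligible because of fingerprint collisions
restores = 154_000             # restore events
restored_browsers = 118_000    # distinct browsers that were restored

eligible_browsers = total_browsers - collided_browsers
eligible_share = eligible_browsers / total_browsers
restored_share = restored_browsers / eligible_browsers
restores_per_browser = restores / restored_browsers

print(f"eligible browsers:     {eligible_browsers:,}")
print(f"eligible share:        {eligible_share:.1%}")
print(f"restored share:        {restored_share:.1%}")
print(f"restores per browser:  {restores_per_browser:.2f}")
```

With these rounded inputs the eligible count comes to about 1.58 million, the eligible share to roughly 91% (close to the stated 90.8%), the restored share to about 7.5%, and each restored browser was restored about 1.3 times on average.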

Before a restore is performed, the user is now presented with their user picture, name, user ID, and up to 4 classes that they've saved to their schedule. They are blocked from performing any action on Coursicle until they click on the recovered user profile, or the "That isn't me" button (which stops the restore process). Although we believe this will better protect user privacy, it does rely on trust and the attention of users, which shouldn't be the case when it comes to account security. Thus, we believe that further revisions of this system are inevitable, specifically those that move us farther away from a silent fingerprint-based restore process and closer to a conventional account recovery process. We believe a great next step is a CAPTCHA-like system where the user proves ownership of the recovered account by picking 3 classes they've taken out of a grid of 12 otherwise random classes from their college.
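The proposed class-picking check could look roughly like the sketch below. It is only an illustration of the idea: the function names, the exact-match pass rule, and the use of a college-wide `catalog` list as the source of decoys are assumptions. A seeded `random.Random` is used so the example is reproducible; a real deployment would more plausibly pass `random.SystemRandom()`.

```python
import random


def build_challenge(taken: list[str], catalog: list[str], rng: random.Random,
                    n_correct: int = 3, grid_size: int = 12) -> tuple[list[str], set[str]]:
    """Build a shuffled grid holding n_correct of the user's classes plus decoys.

    Raises ValueError if the user has fewer than n_correct classes or the
    catalog has too few other classes to fill the grid.
    """
    taken_set = set(taken)
    answers = rng.sample(taken, n_correct)
    decoy_pool = [c for c in catalog if c not in taken_set]
    decoys = rng.sample(decoy_pool, grid_size - n_correct)
    grid = answers + decoys
    rng.shuffle(grid)
    return grid, set(answers)


def verify_challenge(selected: set[str], answers: set[str]) -> bool:
    """Pass only if the user selected exactly their own classes, no more, no fewer."""
    return selected == answers


taken = ["COMP 110", "MATH 231", "ECON 101", "HIST 127"]
catalog = taken + [f"DEPT {n}" for n in range(100, 140)]
grid, answers = build_challenge(taken, catalog, random.Random(0))
print(len(grid), verify_challenge(answers, answers))  # prints 12 True
```

Decoys are drawn only from classes the user has not saved, so the grid never contains a fourth correct answer that would make the challenge ambiguous.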


  1. The stakes of security can be especially high in edtech. Password reuse can put students' university accounts at risk, potentially revealing sensitive financial information and social security numbers.

  2. It could be argued that syncability is really just a specific form of restorability, but it's easier to separate them for explanation purposes.

  3. Fingerprinting involves combining many attributes of a computer and browser (such as the GPU, installed fonts, browser name, etc.) in an attempt to uniquely identify that browser on that computer without relying on cookies or other purgeable browser data. Fingerprinting is an understandably controversial technique since it can circumvent users' attempts to block tracking. Therefore we believe it's critical to provide users with an opt-out whenever fingerprinting is being used, except potentially for fraud-related purposes.

  4. It's important to note: if a user visits the web app from two different browsers on the same device, the browsers will need to be synced just as any two devices would need to be synced. This is because data cannot be shared between browsers and they will necessarily have different fingerprints.

  5. We actually store a lot more than just their UUID and fingerprint hash. We also store the browser and OS that the hash was generated from, the timestamp of when that fingerprint was most recently seen, the last time it was updated, the last time a collision was detected (i.e. a different UUID generated the same fingerprint hash), and some other miscellaneous metadata. These are critical: by associating each fingerprint hash with a (browser, OS) pair, we allow users to have distinct fingerprints for each of their device types. By maintaining metadata about the fingerprints of our users, we're able to adjust the automatic restorability intelligently. For instance, if a collision for a given fingerprint has been detected, we deem that fingerprint "too common" and do not allow user data to be restored based on that fingerprint. The timestamps allow us to take this binary switch and turn it into an analog one: for example, we may deem that a fingerprint that we see from two different users in a 30-day period is "too common", but one that we see from two different users in a 2-year period is unique enough.

  6. Constructing a special link that's able to tie a browser to a UUID is trivial, but it's not so simple to tie a mobile app to a UUID. This is because links to mobile apps go through the App Store/Play Store, which do not pass parameters to the app once it's installed or opened by the user (if they already had it installed). We use Branch to do this install attribution, but there are many alternative solutions, all of which use platform-specific techniques, fingerprinting, or both, to pass link data to mobile apps.

  7. Specifically, the user both signs up and logs in by entering their phone number and then typing in a code that's texted to them. This has a major drawback: carriers recycle phone numbers, so a number can end up belonging to a different person. The loginless system avoids this problem.

  8. This number goes up to 92.9% when excluding browsers on iOS and Android devices, which is to be expected because generally there is less entropy (i.e. less variation in browser and device characteristics) on mobile devices.

  9. Since 118,000 is 7.5% of browsers eligible to be restored, we can conclude that 92.5% of browsers did not need restoring in this time frame. In other words, we can reasonably estimate that 92.5% of our users do not clear their browser's cookies within a 6-month period.