In the most concise definition, the Extensible Messaging and Presence Protocol (XMPP) is an instant messaging protocol designed for extensibility with an email-style architecture.
If none of those words make any sense to you, don't click off yet. This guide will break everything down and make it digestible, even if the extent of your computer knowledge is how to send a text with Siri.
If you don't want to know any specifics and just want to get on with it, the next section will give a rundown of the bare minimum to get your chat running. The remainder of the guide is for building on that knowledge. I recommend reading the full thing, because knowing how it works will help with any problems you run into and help you avoid causing problems through inexperience. Also I didn't write all of it for nothing. Also learning is fun!
Three basic components are needed to use XMPP: a provider, an app, and a contact.
In the same way you need a provider for internet service or email, you need a provider to get on the XMPP network.
XMPP Providers offers an intuitive list of providers ranked by category, with a summary and more specific details for each. However, if you don't care to research the providers, you can just use mine, xmpp.party.
Making an account under most providers is done in-app, explained in the next section.
The app (referred to for the rest of this article as a client) is the interface you use to connect and talk on the network, like how you use a messaging app to text someone. There are many clients available, but the most popular ones are Gajim for desktop computers, Conversations for Android, and ChatSecure for iOS.
Once you've installed a client you will be greeted with a login screen. If it asks for a server or provider, enter the name of the provider (e.g. xmpp.party). If it asks for a username and password, select the account creation option, then enter a username in username@provider.com format and the password you want to use for your account.
If all went well you should be in your new account looking at an empty list. Obviously, no chat app is worth anything unless you have another person to actually chat with. Adding a contact varies by client, but most mobile clients should just have a simple Start chat button and most desktop clients should have it under an account menu. Then, adding a contact is as simple as entering the contact's XMPP address, as if you were sending them an email via their email address.
FOR DESKTOP USERS: While not strictly necessary for chat function, a key feature and reason to use XMPP is message encryption. Mobile clients should automatically encrypt, but if you see any broken padlock icons or warnings about plaintext, make sure to enable OMEMO encryption.
Assuming no user error, that is all you need to do to get on XMPP. Welcome aboard!
Now you know how to skate by on the minimum. But you may have had questions along the way. Extensibility? What do I need a provider for? Why are apps called clients? Protocol?? All will be explained, and this section will tackle what XMPP being a protocol means.
Protocols are something we all use every day without thinking. When you want to buy things at the store, you give them to the cashier, they scan all of them and tally up the total, and they tell you that total and ask for cash or card. If you use cash they receive the cash and give back the correct change, and if you use card they enable the scanner for you to swipe and enter your PIN. When payment is confirmed, they give you the receipt and tell you to have a nice day. These customs form a store payment protocol, and if you want to buy from the store you must follow these customs.
When a computer wants to perform an action with another computer, it similarly uses a protocol. Protocols ensure that no matter what differences two computers may have, they can work with each other as long as they can use the same protocol. It's just like how you can still order from a store even in a different country with a language you don't speak.
XMPP is simply a protocol for delivering instant messages over the internet. A computer learns to speak XMPP when you install a client, which knows the customs for connecting to a provider.
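To make "speaking XMPP" a bit more concrete: under the hood, XMPP messages travel as small pieces of XML. The sketch below builds and reads a heavily simplified chat message using Python's standard library. The addresses alice@xmpp.party and bob@example.org and the function names are made up for illustration, and a real client sends far more than this (login, encryption, and so on).

```python
import xml.etree.ElementTree as ET


def build_message(sender: str, recipient: str, text: str) -> str:
    """Build a simplified XMPP-style chat message as an XML string."""
    message = ET.Element("message", {"from": sender, "to": recipient, "type": "chat"})
    body = ET.SubElement(message, "body")
    body.text = text
    return ET.tostring(message, encoding="unicode")


def read_message(xml_text: str) -> tuple:
    """Pull the sender, recipient, and text back out of the XML."""
    message = ET.fromstring(xml_text)
    return message.get("from"), message.get("to"), message.findtext("body")


# Hypothetical addresses, purely for illustration.
wire = build_message("alice@xmpp.party", "bob@example.org", "Hello!")
print(wire)
print(read_message(wire))
```

Any two programs that agree on this shape can exchange messages, no matter who wrote them or what device they run on. That agreement is the protocol.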
But speaking of providers, how do they fit into the equation, and why do we need one? Can't you just send the message to their computer and call it a day?
At the beginning, I mentioned an email-style architecture, as well as making allusions and comparisons to email. Email and XMPP are quite similar in some ways, and I will break it down by comparing real-life post offices to email, then email to XMPP.
To send someone snail-mail, you need a post office. You can cram letters in your mailbox all day, but without the mailperson you may as well be talking to a wall. So you have a post office, but a new issue arises: this one post office cannot serve the entire world. The mailpeople are unionizing and striking because they work 48-hour shifts delivering cross country and drivers drown trying to drive across the ocean to Europe. So the post offices are restricted to serving one city at a time and are multiplied into a big worldwide network of offices that can all mail to each other by knowing the city they are located in. Sound familiar?
Email works like regular old mail too, and so does XMPP. A server replaces the post office, and the city it serves is the set of users that have registered under it, like how you "register" for a post office's services by moving to its city. For example, instead of New York or London, your email could live at gmail.com or outlook.com. And just like how you have a home address to indicate what building the mail goes to, your username is your account's home address. This is what forms the typical email and XMPP address username@provider.com: literally this user AT this provider.
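The "this user AT this provider" idea can be sketched in a few lines of Python. This is only an illustration of the username@provider.com shape described above. The `Address` type and `parse_address` function are names I made up, and real XMPP addresses have a few more rules than this checks for.

```python
from typing import NamedTuple


class Address(NamedTuple):
    username: str  # the "home address" within the provider
    provider: str  # the "city", i.e. which server to deliver to


def parse_address(address: str) -> Address:
    """Split a username@provider address into its two halves."""
    username, at_sign, provider = address.partition("@")
    if not at_sign or not username or not provider:
        raise ValueError(f"not a username@provider address: {address!r}")
    return Address(username, provider)


# Hypothetical account on the provider mentioned earlier.
addr = parse_address("alice@xmpp.party")
print(addr.username, "AT", addr.provider)
```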
The provider is not only responsible for account creation and management, but also the logistics of message routing and inter-server communication. Providers increase efficiency, reliability and accessibility for the whole network.
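Here is a toy model of that routing job, assuming nothing about how real servers are built. Each `Provider` (a hypothetical class) keeps inboxes for its own users and hands anything addressed elsewhere to the provider named after the @ sign, the same way a post office forwards mail to another city.

```python
class Provider:
    """A toy provider: holds inboxes for its users and knows its peers."""

    def __init__(self, domain: str, network: dict):
        self.domain = domain
        self.network = network  # shared map of domain -> Provider
        self.inboxes = {}       # username -> list of received messages
        network[domain] = self

    def register(self, username: str) -> str:
        """Create an account and return its full address."""
        self.inboxes[username] = []
        return f"{username}@{self.domain}"

    def send(self, recipient: str, text: str) -> str:
        username, _, domain = recipient.partition("@")
        if domain != self.domain:
            # Not one of ours: hand it to the recipient's own provider.
            peer = self.network.get(domain)
            if peer is None:
                return "undeliverable: unknown provider"
            return peer.send(recipient, text)
        if username not in self.inboxes:
            return "undeliverable: unknown user"
        self.inboxes[username].append(text)
        return f"delivered by {self.domain}"


# Two hypothetical providers sharing one pretend network.
network = {}
party = Provider("xmpp.party", network)
other = Provider("example.org", network)
party.register("alice")
bob = other.register("bob")
result = party.send(bob, "Hi Bob!")
print(result)
```

Alice's provider never needs to know anything about Bob except which provider to pass the message to. That is what lets many independent providers form one network.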
The client you installed is the mailbox outside your house: it lets you send and receive messages, and it is your gateway to the provider and the rest of the network. This is why it's called a client: like in a restaurant, the server serves you (the client).
A key feature of XMPP is its extensible nature, which lets it go beyond simple text messages. This is achieved via plugins, which can be applied to the server or to the client, depending on the plugin.
A common example of a server plugin is HTTP file share, which allows for sending pictures, files, video, you name it. Any provider worth their salt will provide filesharing, so do not fear. However, different providers will have different limits for size and how long the files are stored, which is something to consider when choosing.
Clients support plugins too, and almost every client already comes with a handful of plugins pre-installed or built in. The most important are the encryption plugins, but there are also plugins for voice and video chat, filtering plugins such as anti-spam, and convenience plugins such as quick reply buttons.
Although it's technically a "plugin", encryption is the real reason one uses XMPP, and is bundled with practically every client. Encryption is like writing a note in a secret code that only the intended readers can understand. There are several options for encryption, but the most popular (and the only one I know enough about) is OMEMO, so that is what will be covered.
OMEMO is a protocol for end-to-end encryption, meaning a message is unreadable to outsiders because it is encrypted on the sender's device and is not decrypted until it reaches the recipient's device. Even the provider cannot read the message contents, just as the post office does not open your letters. End-to-end encryption is like the secret language you and your friend tried inventing in third grade: both ends understand the conversation in their heads, but any peepers listening in will just hear gibberish.
To understand how OMEMO works, there are three key features to grasp: multi-end encryption, fingerprints, and trust.
If only the intended recipient can see the encrypted message, then how does it work across multiple devices? If my friend sends a message to my desktop computer, would I not be able to see it from the same account on my phone? This is where multi-end encryption steps in, giving a message a route to multiple recipient devices and allowing all of them to read it. This not only enables one account on multiple devices, but also allows for group chat encryption so that all members can read each other's messages.
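The rough shape of "one message, many devices" can be sketched like this. Big warning: `toy_cipher` is a made-up XOR scrambler that is NOT secure and NOT what OMEMO actually uses, and the per-device secrets are a stand-in for the encrypted sessions a real client sets up with each device. The only point is the structure: the message is scrambled once with a random key, and that key is then wrapped separately for every device that should be able to read it.

```python
import hashlib
import secrets


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Toy XOR scrambler for illustration only. NOT real encryption."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same call scrambles and unscrambles.
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_for_devices(text: str, device_secrets: dict) -> dict:
    """Scramble the text once, then wrap the message key for each device."""
    message_key = secrets.token_bytes(32)
    return {
        "payload": toy_cipher(text.encode(), message_key),
        "keys": {
            device: toy_cipher(message_key, secret)
            for device, secret in device_secrets.items()
        },
    }


def decrypt_on_device(envelope: dict, device: str, secret: bytes) -> str:
    """Unwrap this device's copy of the key, then unscramble the payload."""
    message_key = toy_cipher(envelope["keys"][device], secret)
    return toy_cipher(envelope["payload"], message_key).decode()


# Hypothetical devices belonging to one account.
devices = {"desktop": secrets.token_bytes(32), "phone": secrets.token_bytes(32)}
envelope = encrypt_for_devices("See you at 8", devices)
for name, secret in devices.items():
    print(name, "reads:", decrypt_on_device(envelope, name, secret))
```

A device that was not on the list when the message was sent has no wrapped key in the envelope, which is one reason a newly added device may be unable to read older messages.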
OMEMO achieves multi-end encryption using fingerprints. A fingerprint is a fairly long string of numbers and letters that OMEMO assigns to every device, derived from a key that is randomly generated on that device. Fingerprints are like real fingerprints, as their main purpose is identification while still maintaining anonymity. This not only enables multi-end, but is great for security as any impersonators, hackers, or bad actors can't replicate your fingerprint. The idea is to publish your fingerprint in a place other than XMPP itself so that if anyone receives a message from someone claiming to be you, they can cross-reference it with your published fingerprint and see if it's really you.
But all this security would be meaningless without trust, the final component. Most clients will have a setting enabled by default called Blind trust, which lets new contacts message you with encryption before final verification. Once you verify they are who they say they are, you can mark the device as fully trusted. Clients will often remind you of this via a Manage Trust popup or similar. It is important to note that the multi-end nature means every device has its own trust state, and verified devices will be prioritized over blind trusted and untrusted devices. If your friend uses a desktop and a phone, make sure to mark both as trusted! You don't need to literally always compare fingerprints if you expect the message and know it's them, but trust is a good mechanic to know in the event of a hack or even just a random DM. Mobile clients seem to replace manual fingerprint comparison with a QR code you can scan with each other to mark as trusted, so make sure to send a friend your QR.
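Cross-referencing a fingerprint is just a careful string comparison. In this sketch, 32 random bytes stand in for a device's key (an assumption for illustration, since a real client derives the fingerprint for you), and the function names are made up. Clients often display fingerprints in spaced groups and may differ in upper or lower case, so the comparison ignores both.

```python
import secrets


def format_fingerprint(key_bytes: bytes, group: int = 8) -> str:
    """Render key bytes as hex, split into groups so humans can compare them."""
    hex_text = key_bytes.hex()
    return " ".join(hex_text[i:i + group] for i in range(0, len(hex_text), group))


def fingerprints_match(shown_in_client: str, published: str) -> bool:
    """Compare two fingerprints, ignoring spacing and letter case."""
    def normalize(text: str) -> str:
        return "".join(text.split()).lower()
    return normalize(shown_in_client) == normalize(published)


# Stand-in for a device key; a real client generates and stores this itself.
device_key = secrets.token_bytes(32)
published = format_fingerprint(device_key)
print(published)
print(fingerprints_match(published.upper(), published))
```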
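The per-device trust states can be pictured as a small table. This is my own simplified model, not how any particular client stores it: the `Trust` levels, the device names, and the helper functions are all hypothetical.

```python
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0
    BLIND = 1     # accepted automatically, not yet checked by a human
    VERIFIED = 2  # fingerprint compared or QR code scanned


def by_priority(devices: dict) -> list:
    """List device names with the most trusted first (ties sorted by name)."""
    return sorted(devices, key=lambda name: (-devices[name], name))


def mark_verified(devices: dict, name: str) -> None:
    """Record that this one device has been verified. Other devices are unaffected."""
    devices[name] = Trust.VERIFIED


# A hypothetical friend with three devices on one account.
friend = {
    "phone": Trust.BLIND,
    "desktop": Trust.VERIFIED,
    "old-laptop": Trust.UNTRUSTED,
}
print(by_priority(friend))
mark_verified(friend, "phone")
print(friend["phone"].name, friend["old-laptop"].name)
```

Verifying the phone changes nothing about the old laptop, which is why each of a friend's devices has to be marked separately.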
Lack of understanding of the mechanics of OMEMO is the source of a large portion of error and frustration, so make sure you understand trust to keep your conversations secure and private.
Though you may know how XMPP works, there may still be a question in your mind: what's the point? Researching providers, clients, and juggling trust seems like a lot of work when 30 seconds could get you a phone number and a text conversation up and going. But here's why I believe the benefit to be worth the effort:
The internet oldheads might be reading this and wondering why this new kid XMPP can just walk into town and steal all of IRC's street cred like it didn't do all the work to get their Quake deathmatches going. IRC was one of my first considerations, but there are a few reasons why XMPP comes out on top:
Matrix is by far the most popular of the "open messaging protocol" family and first choice for many Discord refugees, me included. And not for nothing as it has potential and is pretty great on paper. It could be the best protocol in town with more features, more plugins and less work... on paper.
My issues with Matrix are just what I've gathered from input and firsthand experience, and I don't know all the most technical details, but this article provides a more thorough examination of the details if you are interested.
"Discord alternatives" in my view are other services that are often brought up when people mention leaving Discord. They range from proprietary services like Guilded to "privacy-oriented" FOSS software like Stoat and Fluxer.
Proprietary services like Guilded are just wholly retreading the Discord issue. Even if they are great now and supersede Discord, they will just become the new evil and repeat the entire cycle. Non-negotiable pass.
The FOSS "privacy" ones are a little bit trickier to see the issue with, since they've got smooth marketing about how respectful they are. In my experience, very few live up to the hype or really take steps to ensure privacy. For example, Fluxer's privacy policy promises not to sell your data, but it still collects lots of personally identifying information, scans all media "for safety" (red flag), and relies on many privacy-violating third parties such as Cloudflare. Their nature as specific services instead of protocols also limits their usefulness and longevity, even if they are self-hostable. Also, there are just way too many of these cropping up all the time and I can't be bothered to keep track of every single one.