Total Privacy-by-Design.

Posted on: 2022-06-14

Does the privacy-by-design paradigm relate only to personal data? What is the simplest, most effective way to implement this important concept? Are any trade-offs necessary? What is the role of open-source code?

In the first text of this series, “Why PrivMX?”, we presented our motivation and our way of looking at one of the most burning questions of the 21st century. We came to the conclusion that there is no time to waste: we have to start creating efficient mechanisms to protect the privacy of communication, focusing in particular on the results of teamwork, which is the most important source of innovation.

You might say: “it’s already happening!”, probably having in mind the GDPR and similar, privacy-related legislation appearing in different parts of the world. Yes, we do agree that it’s happening, but such regulations focus on and promote the protection of personal data, which is an important but fairly narrow category of content that we feed into the global network.

Despite the huge importance of the new regulations, their clear benefits and the rapid growth of the discussion about digital privacy, there are reasons to worry that, in the long run, things will not be so rosy. The discussion itself seems to deepen the constant “devaluation” of the concept of privacy to the level of mere personal data protection, and to blur the dialogue about the deeper aspects of our life and privacy in the digital world. One example is that people often confuse privacy with anonymity, which are clearly not the same thing.

If we take a broader perspective, we may conclude that due to the complexity of the network and its applications, the matter is so hard to change that not much can be done on the subject of “general privacy”. We at PrivMX think it’s true – although only to some extent. There are, however, some key areas, such as teamwork, that we can (and must!) take care of. Convincing people of that and offering specific solutions is our mission.

Legal vs physical protection

The distinction above is worth noting because it’s crucial when we think about specific solutions. Although the new privacy legislation sets the right rules and guidelines, and even imposes penalties, it doesn’t physically prevent our (personal) data from being read – just as the law on theft and its penalties doesn’t prevent your bike from actually being stolen.

In such situations, our thinking switches naturally from legal protection to physical security – you can, for example, lock the bike away somewhere. Such actions usually protect us from losing the asset, although they make life a little more complicated… Nevertheless, we usually accept that after a quick “pros and cons” calculation.

The situation of you and your team sharing and storing content online is similar – if you want to be the only people capable of accessing the content, you have to use physical protection, which (unfortunately) will also make your life a little more difficult.

Zero knowledge on “the other side”

… meaning “we don’t want anybody (or anything other than our computers) to be able to read our data!” – such a conclusion is a starting point for fixing the situation. Such statements have probably already been made in your company (congratulations!), but the important part is whether they have been implemented, and how.

The digital world of the 21st century is very complex, from both the technical and the marketing point of view. Currently, most digital service providers mention “increased privacy” or similar trust-building features. It’s easy to feel lost (especially as “privacy” tends to be defined as “GDPR compliance”) and, in most cases, one ends up choosing the tools that seem most convenient… which immediately contradicts the direction we have indicated above.

We don’t wish to start a discussion of the available solutions here (that will be published soon as another blog post), so let’s go straight to the conclusion: the most important characteristic is whether the servers involved, and their admins, “have knowledge” of the content of our data or not.

Only “zero-knowledge servers” can guarantee what we really want: they physically prevent the people and devices “on the other side” from reading our data. The only sensible way to build such a system is end-to-end encryption on the client side. That means an arrangement in which your data is available only to you, on your devices – and only these devices are able to encrypt and decrypt it. That way your content is hidden from “the other side” – locked with a key, just like the bike we mentioned earlier.
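To make the idea more tangible, here is a minimal sketch of client-side encryption in front of a server that only stores opaque blobs. It is not PrivMX code: the names `derive_key`, `encrypt`, `decrypt` and `ZeroKnowledgeServer` are hypothetical, and the cipher is a toy built from HMAC-SHA256 so that the example runs with the Python standard library alone. A real system would use a vetted authenticated cipher such as AES-GCM.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password. This runs only on the client."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)


def _subkey(key: bytes, label: bytes) -> bytes:
    """Split one key into independent keys for encryption and authentication."""
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode, used as a TOY stream cipher (illustration only)."""
    blocks = [
        hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate. Output layout: nonce | ciphertext | tag."""
    nonce = os.urandom(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Verify the tag first; refuse to decrypt with a wrong key or altered data."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class ZeroKnowledgeServer:
    """Hypothetical server: it stores blobs by name and never receives a key."""

    def __init__(self):
        self.blobs = {}

    def put(self, name: str, blob: bytes) -> None:
        self.blobs[name] = blob

    def get(self, name: str) -> bytes:
        return self.blobs[name]


# Client side: the key is derived and used locally, only ciphertext is uploaded.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
server = ZeroKnowledgeServer()
server.put("note-1", encrypt(key, b"Q3 roadmap draft"))
print(decrypt(key, server.get("note-1")))
```

The point of the sketch is the division of labour: everything that touches the key happens on the client, and the server object holds nothing it could read.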

Such an approach is the appropriate physical protection of our data in the digital world, “total privacy-by-design”. It’s being used in all parts of the PrivMX ecosystem, on all levels and within all tools – it’s a decision we made at the very start of designing the architecture of PrivMX.

Inconvenient consequences

This is where the inconveniences of physical protection mentioned above start to appear. They all come down to one simple situation: before you can use the data (the bicycle) you have to decrypt it (take it out of your bike locker). Though it sounds obvious when you imagine the bike, it’s not that clear when it comes to digital services based on zero-knowledge servers.

It is currently common market practice that the servers forming part of a digital service can process and analyze our data in various ways. This makes it easy to create connected, complementary systems on “the other side” that provide users with brand-new features based on increasingly deep analyses of their data. In our case, when the servers cannot read the data and cannot do anything with it, the situation differs significantly from the widely accepted technical standards.

In the “fully-fledged privacy-by-design” arrangement, only the users’ personal computers can operate on the data, because only they have access to it. This unusual situation sometimes makes it significantly harder to offer features that are taken for granted in other services. One example is integrating services based on zero-knowledge servers – it is possible, of course, but it requires additional work, which most often involves setting up “trusted bridges” between such services (that’s why we have created the PrivMX Bot software).

Any trade-offs?

A different, much simpler example of the inconvenient consequences of physical protection is that forgetting your password (or losing a hardware key) in an end-to-end encrypted system results in a permanent loss of access to your account and to the data it contains.

This situation is similar to losing the key to the place where we keep our bike. In the physical world there is a clear and usually easy physical solution, such as breaking the lock. In the digital world this is mostly impossible: an attempt to guess a password or find a decryption key can take thousands or millions of years, even with the fastest computers.
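A back-of-the-envelope calculation shows where such numbers come from. The sketch below assumes an attacker who can test 10^12 keys per second (a hypothetical figure chosen for illustration, not a measurement) and who has to try every possible key of a given length.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Assumption for illustration only: one trillion guesses per second.
GUESSES_PER_SECOND = 1e12


def years_to_exhaust(key_bits: int, guesses_per_second: float = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key of the given length at the given speed."""
    return 2 ** key_bits / guesses_per_second / SECONDS_PER_YEAR


for bits in (40, 64, 128):
    print(f"{bits}-bit key: {years_to_exhaust(bits):.3g} years")
```

Under these assumptions a 40-bit key falls in about a second, while a random 128-bit key would take on the order of 10^19 years, far longer than the age of the universe. Note that this applies to randomly generated keys; a short or predictable password can be guessed much faster.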

As such consequences can be very inconvenient or even destructive for many users and their companies, this is a good place for a trade-off and a controlled loosening of our assumptions. In PrivMX, the main encryption key of selected users (labelled as “managed”) is kept among the encrypted data of other users (Team Keepers), who can use it to reset the password of a person who forgot it. This is a tried and true method in teams with less experienced web users. The “other side” in this arrangement still doesn’t have access to any of the encryption keys, of course; they remain available only to the Team Members.
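The general idea behind such an arrangement can be sketched as key wrapping: the same main key is stored twice, once locked with the user’s password and once locked with a Team Keeper’s key. The code below is an illustration under our own assumptions, not the actual PrivMX implementation; `wrap`, `unwrap`, `kdf` and the variable names are hypothetical, and the wrapping scheme is a toy built on HMAC-SHA256 to keep the example within the Python standard library.

```python
import hashlib
import hmac
import os


def kdf(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte wrapping key from a password (client side only)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)


def wrap(wrapping_key: bytes, secret: bytes) -> bytes:
    """TOY key wrap for a 32-byte secret. Output layout: nonce | body | tag."""
    assert len(secret) == 32
    nonce = os.urandom(16)
    pad = hmac.new(wrapping_key, b"pad" + nonce, hashlib.sha256).digest()
    body = bytes(s ^ p for s, p in zip(secret, pad))
    tag = hmac.new(wrapping_key, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap(wrapping_key: bytes, blob: bytes) -> bytes:
    """Recover the secret; fails if the wrapping key is wrong."""
    nonce, body, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(wrapping_key, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted blob")
    pad = hmac.new(wrapping_key, b"pad" + nonce, hashlib.sha256).digest()
    return bytes(b ^ p for b, p in zip(body, pad))


# Setup: the managed user's main key is wrapped twice.
main_key = os.urandom(32)
salt = os.urandom(16)
by_password = wrap(kdf("old password", salt), main_key)  # stored for the user
keeper_key = os.urandom(32)                              # known only to the Team Keeper
escrow = wrap(keeper_key, main_key)                      # kept among the Keeper's data

# The user forgets the password: by_password is now useless to them.
# The Team Keeper unwraps the escrow copy on their own device...
recovered = unwrap(keeper_key, escrow)

# ...and re-wraps the same main key under a new, temporary password.
new_salt = os.urandom(16)
by_password = wrap(kdf("temporary password", new_salt), recovered)

print(unwrap(kdf("temporary password", new_salt), by_password) == main_key)
```

The server only ever stores the wrapped blobs, so it can take part in the reset without being able to read the main key. The data encrypted with that key never has to be re-encrypted, because the key itself does not change.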

By the way, here is a fact that may surprise some of our readers: if a service you use offers a “reset/recover password” option, it means in most cases that its administrators are able, if necessary, to read the content you store. Your password there serves mostly as an online access control that keeps other people on the open Internet out. It usually plays no part in protecting your data from being accessed from “the other side”, which is worth bearing in mind.

To sum up, inconvenient consequences and trade-offs are all a matter of balance between privacy and convenience. In our “full” approach, it all comes down to implementing the privacy-by-default rule: initially we emphasise and implement privacy in all functions, but in some exceptional cases we soften the uncompromising approach and, in a controlled way, introduce solutions that are suitable yet convenient.

The most powerful force pushing us to soften the approach in the zero-knowledge case is users’ habits, formed over years of using “normal” services that provide a “normal” level of care for data. An excellent example is the issue currently reported most often by PrivMX Fusion users: it is not possible to share your own encrypted, private calendars with services provided by Google, Apple or Microsoft. People are used to having a single calendar app where they store all the events that interest them, and the calendars from these companies are the most popular. At the time of writing (May 2022) we are starting to change the inner structure of the calendar so that users can make their own informed decision about some of the data in their private calendars. There will probably be a button in a suitable place in PrivMX saying “Yes, I want to share my data with third-party companies”… which, of course, will not be compulsory.

The digital rule of limited trust

Finally, let’s get back to the general question of privacy – there is one more important aspect connected to it that we need to discuss: trust. In fact, you could say that in the digital world everything is based on trust, because statistically very few people have full knowledge of how internet services work in detail. Although we do not know these people, we trust them and entrust them with our valuable content.

In the 21st century, such a careless approach is becoming outdated, as shown, for example, by the wide introduction of personal data protection law. The reason for the decreasing level of trust is the difference between the providers’ declarations and the actual functionality of their services. In most cases this is not a result of bad intentions, but the difference is real, since it has triggered such a reaction and such a broad discussion.

The physical protection mentioned earlier, based on depriving “the other side” of knowledge about the content of our data, provides us with a sort of shield that protects us from the consequences of such a difference – that’s clear. What can be harder to notice is that such a shield might not be enough.

In services based on the zero-knowledge rule, with end-to-end encryption, only the end computers (that is, the computers we use) have access to the content – or, to be more precise, only the client apps that we get from the service provider. According to “the digital rule of limited trust”, we should verify that the apps do only what they are supposed to do, encrypt our data properly and send it only where we want.

Aside from the obvious requirement of programming knowledge, the main condition for succeeding in this task is the ability to see what the program does and how it manages the data. That is possible only if the provider shares the full source code, together with the ability to independently build the app that you will later run.

Open source code is the feature that completes our “puzzle” – because of it, the eroding trust in the digital world is no longer crucial for us. To put it simply: if we know what the program we use does, and we know that it doesn’t allow “the other side” to see our data, trust is not that important anymore. For us, the end users of the gigantic cyberspace, it’s a perfect arrangement.

Moreover, if the software (source code) license also lets us modify it, we gain additional power to influence what happens to our data. Apart from the ability to independently add important functions, we can also make our own decisions about softening the uncompromising approach to data privacy – for example, to limit the “inconvenient consequences” specific to our company.

In PrivMX we try to follow that path consistently – the source code of PrivMX Team Server and PrivMX Fusion is open, and within your organization you can create and use modified versions. We’re glad that we are not the only team that thinks this way – in the article about open-source tools for cooperation we present other examples of such an approach.

Digital workspace

In this blog post we have described the most important ingredients of our “fully-fledged privacy-by-design” approach:

zero-knowledge servers, which are not able to read the users’ content;
full end-to-end encryption of all the content in all of the tools;
integration with different systems using “trusted connectors” such as our PrivMX Bot;
privacy-by-default – full privacy as the starting point and softening this assumption on demand or through controlled workarounds – to provide the appropriate convenience and/or functionality;
open source code – the final piece, which settles the question of trust and makes it possible to adjust the software to your own needs.

A question could be asked: is it worth creating and entering such an unusual digital workspace? It’s difficult to answer on the go, with all our habits on the one hand and a lack of time for such reflections on the other.

It all becomes clear only later, during intensive online or hybrid work with your team, when new, unique ideas suddenly appear in your notes and plans. When you need to store test results or other important content, passwords or clients’ data, and share them with the company. When a casual video call with your Team Members turns out to be confidential or strategic. It’s impossible to avoid such situations in the 21st century – we all know that.

Then you really feel glad that you have chosen the right online collaboration tool, because you know that you have taken care of yourselves.

The amount of data we generate, send and store is overwhelming: within a single year, simply while doing our job, our team created more than 75,000 messages, 9,000 files and 7,000 tasks, and held 900 video calls. Nobody in our team can say precisely what is in those gigabytes of data – that’s natural and obvious. We know, however, that nobody on “the other side” can either, and that our content is not being abused by algorithms or devices. We use PrivMX :)

In the next text of the series we will try to explain the main tool of our ecosystem, PrivMX Fusion – how it was created and what the “Fusion of Tools” is.


Matt Muszytowski