Ep. 106 Securing the Internet with Josh Aas, Sarah Gran of ISRG
Michael Lynn: The Internet Security Research Group, ISRG. Probably not a name you're super familiar with unless security is a part of your job. The ISRG was founded in May 2013, with a really important mission: to reduce the financial, technological, and educational barriers to securing communication over the internet. Now, it's more likely that you've heard of their flagship project. It's called Let's Encrypt. Let's Encrypt is a wonderful certificate authority. It's easy to use, and it makes the process of obtaining and implementing certificates, as a part of your website, super simple. Josh Aas and Sarah Gran from ISRG join me today to talk about some of the other projects that they have, and also to talk about the fact that Let's Encrypt is not just for websites. We'll learn more after the break. My name is Michael Lynn, and this is the MongoDB Podcast. MongoDB World is returning to New York City. MongoDB World 2022, the future runs on MongoDB. It's a conference for creators, disruptors, and transformers of tomorrow. You can register today. Head on over to MongoDB.com/world-2022. Join us from June 7th through the 9th, for three days of announcement-packed keynotes, hands-on workshops, and deep dive technical sessions that'll give you the tools you need to build and deploy mission critical applications at scale. We've got a special offer for you folks. There's a discount code, it's podcast. Use the code podcast to get 25% off the currently already discounted rate. Head on over to MongoDB.com/world-2022. Remember to use the code podcast for your special discount.
Sarah Gran: Hello, I'm Sarah Gran. I am the VP of Communications and Fundraising at ISRG, the nonprofit behind Let's Encrypt and our other projects, Prossimo and Divvi Up. I've been at ISRG since early 2016. I remember when I started, I felt like I had kind of missed all the buzz and excitement because Let's Encrypt had already launched, and I'd known Josh for a long time and had been talking with him about this process. But as it turns out years later, there was a lot of excitement yet ahead. Here we are today with three great projects and a certificate authority that is serving a huge portion of the web.
Josh Aas: I'm Josh Aas, I'm the Executive Director at ISRG. I helped start the organization in 2013, I think is when we got going. Didn't really become public until 2015. I've been working on this for eight or nine years now, and I'm very excited. We're still here. As Sarah said, we're still around, still doing good work. I used to work for an organization called Mozilla on a browser called Firefox. My job there for a while was to run the networking team, then work on some strategy stuff for Mozilla. One of the biggest problems we had back then was the fact that so many websites were not using HTTPS, so just a huge percentage of the web was not encrypted. If you're working on a browser, that's a really frustrating problem because you want to provide a secure experience to everyone that's using your browser. But if there's no transport layer encryption, there's only so much you can do, right? Your users are just vulnerable all over the web, and no amount of code you write in the browser is going to make that better. It's a very frustrating problem to think, "Well, we can't do much to help our users in this respect if we can't figure out how to get hundreds of millions of websites to switch to HTTPS." It just seemed at the time like a pretty, almost insurmountable task. Right?
Michael Lynn: I mean, today, it just seems ubiquitous. I mean, I don't think I've visited a non-HTTPS website in... I mean, I can't remember. To what do you attribute that, the success of deploying that type of technology?
Josh Aas: Well, at the time we thought about a lot of different options. How do we get there? We were really looking for options that were going to get us to nearly a hundred percent encryption in a relatively short amount of time. We didn't want to be on the IPv6 timeline where we say where we want to go, and 20 years later, we're not even close to the goal. We wanted something that we thought would work relatively quickly. We talked to a lot of people about why their websites aren't encrypted, and we realized that most people want to do the right thing. Most people want to turn on encryption. It was just kind of a pain. The main pain point was that it was difficult to get the TLS certificates that you need, or the SSL certificates. People had trouble figuring out, "Where should I get it from? Which kind of cert should I get? Is it affordable? Can I automate this, or is it going to be like a big manual process all the time?" It just was really difficult to get and manage the certificates that you need for TLS. It felt to us like if we could solve that problem, solve most of the problem, then people would hopefully just start turning on TLS a lot more frequently. I think that theory has turned out to be true for the most part. We started Let's Encrypt with a focus on making sure that certificates were free, but also really easy to get. All this stuff is really in service of ease of use, even the fact that it's free. Certificates can be cheap. They could be pennies, but if we charged anything, even if it was pennies, you'd have to go find a credit card. You'd have to enter that. What do we do when your credit card expires? You're going to have to set up a billing system. If you're inside a corporation, you want to set up a little website. You're going to go get a credit card approval from your finance department. There's a lot of friction around payment, regardless of how much the actual payment is, so we just wanted to get rid of that.
Then also, just make the experience on the technical side really easy. There should just be a client or a piece of software that gets the cert for you. It should happen in a matter of seconds. It should be really easy to automate that and have it renew automatically. We had this really big focus on ease of use when we built Let's Encrypt, and now it's pretty easy to get certificates. I think we're at 90%, maybe 93% in the United States for encrypted page loads. I think it's 82% worldwide, but I think in a lot of... I think that's not reflective of everybody's experience. I think that's the average. I have this theory that a lot of the unencrypted page loads are sort of clustered inside corporate environments, like intranet type applications, so if you're on one of those things, maybe you have more unencrypted page loads, but if you're like me, I have my browser set to not even load pages that aren't encrypted, and I rarely encounter that. For me personally, I'm pretty close to a hundred percent HTTPS and that's exciting. I think fixing the certificate problem on the web was a big part of the solution, and I say part because it's not the only thing that happened. There's been a lot of great advocacy and research work. There's been a lot of improvement in the other tools that people use to administer and run websites. A lot of hosting providers have decided to just make certificates the default instead of sort of an add-on service. The browsers have done a really great job of essentially pushing people in the right direction. It was a big deal to make things like geolocation require HTTPS. If you want to use more powerful, modern features on the web, the browsers won't let you unless the connection is using HTTPS. There's been a lot of work done on the user interface stuff around HTTPS, so the browsers have done a great job, researchers, software developers and other tools. It all came together to get us where we want to be pretty quickly.
Sarah Gran: Just to put a fine point on where it is that we came from, Josh mentioned that now today, HTTPS page loads are over 90% in the U.S., and over 80% globally, but when Let's Encrypt launched, that was less than 40%. 39% of page loads were encrypted at that time, and that's after this technology had been available for 20 years. In 20 years, that's all the progress that had been made, so it is really amazing that the web has made such a significant and broad reaching change in such a short period of time. A lot of that is attributed to exactly what Josh was saying, the wide variety of people coming at this problem from different angles to help support moving it forward.
Michael Lynn: So let's talk a little bit about how MongoDB uses Let's Encrypt. I mean, we have a long and storied relationship even prior to becoming a user of Let's Encrypt. So Sarah, maybe you want to talk a little bit about the relationship between MongoDB and Let's Encrypt?
Sarah Gran: Sure. MongoDB has been a longtime partner of ours. MongoDB actually became a financial sponsor of Let's Encrypt prior to even getting a single certificate from us. Let's Encrypt and ISRG are 501(c)(3) nonprofits, so we're able to do our work because companies like Mongo have supported us. It was a great start to the relationship to hear that Mongo was invested in us making the change to the web that we were hoping for. Then the story got more interesting when we heard from some folks a year or two later at MongoDB that you all were interested in actually using Let's Encrypt certificates. So today, every time you spin up a cluster in Atlas, TLS is a requirement, and that comes from Let's Encrypt. We are now providing many certificates to MongoDB customers, and we think this is a pretty interesting use case for our certificates because it's a little bit different than what many people might think of as a standard use for a Let's Encrypt certificate, basically on a website. To show that these certificates are useful for protecting all of Atlas's client data and all intra-cluster network communications is a cool example of how TLS really needs to be at all of those layers of an infrastructure when you have that kind of connection.
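To make the "not just for websites" point concrete, here is a minimal sketch of how any TCP client, such as a database driver, might require and verify TLS before sending data. It uses only Python's standard `ssl` and `socket` modules. The function names, the host argument, and the choice of TLS 1.2 as a floor are assumptions for illustration; this is not how Atlas or any MongoDB driver is actually implemented.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server certificate."""
    # create_default_context() loads the system's trusted CA roots and turns on
    # certificate verification and hostname checking for client connections.
    ctx = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def open_tls_connection(host: str, port: int) -> ssl.SSLSocket:
    """Open a TCP connection and wrap it in TLS (hypothetical helper).

    The handshake fails with ssl.SSLCertVerificationError if the server's
    certificate does not chain to a trusted CA or does not match `host`.
    """
    ctx = make_tls_context()
    raw = socket.create_connection((host, port), timeout=10)
    return ctx.wrap_socket(raw, server_hostname=host)


if __name__ == "__main__":
    ctx = make_tls_context()
    print(ctx.verify_mode, ctx.check_hostname)
```

Nothing in the sketch is specific to HTTP: the same certificate checks apply to whatever protocol runs over the wrapped socket.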
Michael Lynn: You mentioned the Let's Encrypt certificates are free. They're created by the users of Let's Encrypt from the command line or via an API. This obviously costs you money. I mean, obviously you've got infrastructure in place. You've got to maintain a high level of security. I'm sure you have operating expenses. How are you able to do that?
Sarah Gran: Well, we are a nonprofit and we ask that people who care about security and privacy on the internet, for themselves or for everyone else, help contribute to making this work possible. Let's Encrypt operates on a budget of about $4 million every year, and the majority of that cost goes toward our staff. We have site reliability engineers and a team of software engineers that maintain both our Boulder CA code and the certificate authority itself. That requires a sizable team of people. The majority of the money that we raise for Let's Encrypt goes directly into supporting those people, making the CA happen, and we're glad to have MongoDB as a long time supporter of that work.
Michael Lynn: I love the transparency. I believe that's one of your key operating principles. You want to talk a little bit about those principles?
Josh Aas: Yeah. Transparency is pretty important to us. Running a certificate authority is all about trust. If people don't trust you, then really nothing else you do matters. Transparency, I think is at the heart of... If you're going to ask people to trust you, you should be transparent about what you're doing. We're transparent at a bunch of different levels. There's a lot of information out there about our policies and how we run our service. There's our CP and CPS documents you can find on our website. But we also have, for example, a lot of open source software. We don't exist to create open source software, we exist to run a service, but open source is important to us and if you want to see how our CA works, there's a repository on the Let's Encrypt organization called Boulder. That's the software we use to run the CA so you can see exactly how we do things. So yeah, we've got a great community that we work with that helps answer questions, and disseminate information about what we're doing.
Michael Lynn: The Boulder Project, is that an open source project? Do you take pull requests?
Josh Aas: We do take pull requests. What I would say though is Boulder is really intended to work on our infrastructure, so it's not something that we really recommend that other people run because it's just not general purpose software. It's deployed in a very specific certificate authority environment. People definitely contribute pull requests now and again, but it's not common. The purpose of it being open source is more for the sake of its transparency than it is to try to build up a large developer community. The needs of our certificate authority are pretty specific, and we're not really able to develop the software in such a way that it's generalized and works for a lot of other people.
Michael Lynn: So I'm curious about the other projects within ISRG. Do you want to talk a little bit about some of those?
Josh Aas: Sure. So we started Let's Encrypt, publicly announced it in 2015. Since then, we've started two more projects. So the second project that we have is called Divvi Up. In more general terms, the technology behind it is called privacy preserving metrics. The idea here is organizations that create applications, whether on a cell phone or the web, they want to collect metrics from their users. Sometimes that's about the performance of their apps, or sort of meta information about how their applications are working, and sometimes the collection of metrics is part of the actual functionality of the application itself, the content of the app itself. There's always been this tension between organizations wanting to get data and user privacy, and we're trying to do something to deal with that tension in a better way. We want to get the organizations the data they need about the users in aggregate without actually collecting individual user data. The technology behind our Divvi Up service works essentially by the application will take the metrics data from users and break it into two different pieces. There'll be some local anonymization applied to those pieces on the device or on the website or something, and then each piece gets encrypted for a different destination. One piece will come to ISRG or our Divvi Up project, and another piece will go to a second provider, who's running essentially the same service. We'll take a bunch of that user data in batches, which by the time it gets to us is unintelligible because it's been split in half. We don't really know what it means. It's just a bunch of numbers to us. We sum those numbers all together and do some math on that, and produce what we call a partial aggregate sum. The other provider is doing exactly the same thing. Once we've created this aggregate sum, we throw away all the data that we got from individual devices. 
Then we send those two partial sums back to the application developer where they can zip them together, and that produces the final aggregate data about the entire population of users. But nobody along the way, like once the data leaves your device, nobody can really interpret or understand your data. It really helps a lot with user privacy, but also gives organizations access to the aggregate data that they need. We think this is a pretty big deal. It's applicable to just about every application out there, every cell phone app, every website. Organizations want to collect data without violating privacy. They don't want the liability that comes with sitting on a bunch of individual user data, and the system really just solves that problem.
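The split-and-sum flow described above can be sketched with additive secret sharing, one common way to build this kind of system. The transcript does not say which scheme Divvi Up actually uses, so treat the technique, the modulus, and every function name below as assumptions for illustration. The sketch also leaves out the local anonymization and the per-destination encryption mentioned in the conversation.

```python
import secrets

# Hypothetical modulus for the arithmetic; any value comfortably larger than
# the biggest possible total works for this illustration.
MODULUS = 2**61 - 1


def split(value: int) -> tuple[int, int]:
    """On the user's device: break one metric into two shares.

    Each share on its own is a uniformly random number, so neither
    provider learns anything about `value` from the piece it receives.
    """
    share_a = secrets.randbelow(MODULUS)
    share_b = (value - share_a) % MODULUS
    return share_a, share_b


def partial_sum(shares: list[int]) -> int:
    """At one provider: add up a batch of shares, then discard them."""
    return sum(shares) % MODULUS


def combine(partial_a: int, partial_b: int) -> int:
    """At the application developer: join the two partial sums."""
    return (partial_a + partial_b) % MODULUS


def aggregate(values: list[int]) -> int:
    """Run the whole flow for a batch of per-user metrics."""
    shares = [split(v) for v in values]
    sum_a = partial_sum([a for a, _ in shares])  # first provider's batch
    sum_b = partial_sum([b for _, b in shares])  # second provider's batch
    return combine(sum_a, sum_b)


if __name__ == "__main__":
    # Five users each report a small count; only the total is recovered.
    print(aggregate([3, 5, 0, 1, 1]))
```

The random parts cancel out when the two partial sums are added, which is why the developer recovers the true total while each provider only ever sees numbers that look random.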
Michael Lynn: What stage is the project at today?
Josh Aas: We've actually deployed a version of it already for COVID-19 exposure notification applications. If you are running exposure notification functionality on your iOS or Android device, there's a good chance that we are the service provider, or one of the service providers, behind that. We built that... We got asked to do that in, I think September 2020, and by the end of the year, within less than three months, we had built out the whole system. We've been learning a lot from running that exposure notification platform for a while now. What we're doing is taking the lessons learned there, creating a more refined public specification for the technology with some of our partners, and then building it out so it's a service we can offer to everyone else. Just like Let's Encrypt, we want to have a really big focus on ease of use. One of the most unfortunate things about the internet today is that there's so many pieces in anything that you do. Just to set up a little website, you've got to deal with SSL certs. You've got to configure HTTP headers. You've got to set all your SSL parameters. There's all sorts of knobs to tweak to make sure you're doing the right thing. If you're running an application, you don't... Metrics are just one part of what you're doing, right? We just need to make that as easy as possible so that they can make the right choice about user privacy without having to spend too many resources on that because otherwise, they're going to make the choice to skip it. They're just going to say, "Well, this is a little less safe, but we don't have time to do the safer thing." Again, just like Let's Encrypt, we need to make our Divvi Up services as easy to use as possible so that it's just so easy to do the right thing.
Michael Lynn: What a great project, and it seems like a natural extension really well-aligned with the mission. I want to let folks know, check the show notes. I'm going to include links for more information on these projects and to ISRG. What other projects are you working on at the ISRG?
Josh Aas: Our third project is called Prossimo, and it's really about bringing better memory safety to the Internet's most critical software. A lot of the software that we rely on, on the internet, is written in either C or C++. Sometimes assembly, but mostly C. These languages are not memory safe, meaning it's pretty easy for programmers to make errors in how they manage memory. Those errors often lead to memory safety vulnerabilities. If you've got a phone and you've ever checked the release notes for a software update on your phone, iOS, Android, whatever, look at the list of software vulnerabilities. It is pretty consistently one long list of memory safety vulnerabilities. It's memory corruption, use after free, buffer overflows, all that kind of stuff. Not everything is that, but a pretty large percentage of it. I think for most larger tech companies, maybe 70% of major vulnerabilities are memory safety issues. The interesting thing about this problem... We've got a lot of problems on the internet, right? A lot of things that we could do better. The interesting thing about this problem is that we know how to solve it. There are lots of other problems I don't know how to solve. I don't know how to solve programmers. I don't know how to stop programmers from making logic bugs in applications. I don't know how to solve IPv6 adoption. We do know how to solve memory safety, not just mitigate it, but just solve it. We have safer languages, and now we have languages that are safer and work on a systems level. C was always attractive because it doesn't have a runtime. It's very fast. The goal has always been... The dream has always been to have a fast system level language with no runtime that's also memory safe and we have that now, at least in Rust. We want to try to take this really central internet software infrastructure and move it away from C and C++ to safer languages. There's a lot of C code in the world, a lot of C++ code out there. We're not trying to change all of it.
We're not trying to do the top 1000 projects, or even the top 100 projects. We're really trying to find like literally, what are the top 10 most important things that the internet depends on, and let's make that stuff safer. If you go to memorysafety.org, you can see a list of what we're working on. A great example so far, like the internet really depends on the Linux Kernel. That's just at the heart of everything. That is 3 million lines of C code, frequently suffering memory safety vulnerabilities. It's going to take a long time to change that. I'm not under the illusion that we're going to move the Linux Kernel to a safer language in a few years. That's not going to happen, but we need to get started. The Linux Kernel's going to be around a long time. The internet is going to be around a long time. There is a huge time horizon for investments to pay off, even if they're big, long term investments. We have Miguel [inaudible] contracting with ISRG's Prossimo project. He's working on adding Rust as a second language to the Linux Kernel so that people can start writing device drivers and other components in Rust, which is a memory safe language. He's doing great work. Honestly, when we started this project, I knew that the Linux Kernel was obviously maybe the most essential piece of the modern internet, but I thought it was so difficult that, even with my extreme optimism and ambition about this stuff, I left it off the list, but man, they have made great progress. It's really incredible. It's getting pretty close, I think, to getting merged. Yeah, he's doing great work. Then we're also looking at DNS. DNS is a huge part of how the internet works. Again, millions of lines of C code. We have so much evidence that this is dangerous, right? It's a problem. Let's fix it. It's going to take a little while. It's going to take a little work, but we can do it. This world is full of talented software engineers. It's not an insurmountable problem.
We're working on investing in some DNS implementations that are much safer than what's out there today. We just signed a contract to get a much safer NTP implementation for Network Time. We have invested pretty heavily in a new TLS library called Rustls, that we ultimately hope someday will replace OpenSSL in the ecosystem. OpenSSL is as notorious as anything else for its security issues, many of which are related to the fact that it's written in C. I really hope that 10, 20 years from now, we're not still running TLS stacks that are written in C. I hope we learn along the way. Our goal is to bring about that safer future for the really central, most important software on the internet.
Sarah Gran: This project is a little bit different from the Let's Encrypt project or Divvi Up in that we aren't writing or maintaining most of the code that we are trying to influence or improve. Prossimo is focused on awareness, education, advocacy, and developing really clear strategies for how to make a meaningful impact and difference in a small handful of the most important security sensitive software that exists out there. We have developed a process for first identifying what the risk is to the software, how vulnerable it is to memory safety vulnerabilities, and then we assess the opportunity. There are four things that we look at. Is there a library or component that can be used across a lot of different projects, meaning that this impact can scale? Can we efficiently replace key components with a memory safe library? Are funders willing to fund this work, and are the maintainers on board and cooperative? With that set of four criteria, we have built out a small list of initiatives that we're focused on in order to really make sure that this work has high impact at a reasonable cost and is supporting the developers and maintainers who have worked long and hard to make this software so popular and essential.
Michael Lynn: How is it possible that you're able to fund this, and track this, and manage this over that great period of time?
Josh Aas: Like Sarah said, we identify the projects based on some risk criteria, and then we look for opportunity. For each project that we identify that seems like it has good opportunity, we sit down and make a really clear plan and we go out and talk to the maintainers, we talk to users, we talk to the community, and we make the clearest plan we can for what a roadmap looks like to get to a memory safe future. In a lot of cases like Sarah mentioned, the answer is not to just go rewrite it from scratch. The answer is try to do it modularly, piece by piece over time. Not only is that more realistic in a lot of cases, but it gives you gains more immediately, right? You can take an existing piece of software like cURL, which is really important and really ubiquitous. I don't think it makes sense to rewrite cURL from scratch in a new language, and I don't think the maintainer of cURL thinks that either, so we're not going to do that. We went out and we had someone swap out OpenSSL for the Rustls TLS library. That's something you can use today. We contracted with the maintainer to swap out the C based HTTP implementation in cURL, and replace it with a memory safe HTTP implementation. Right now you can go build cURL if you want, and you can build it with memory safe HTTP and TLS networking, and that's available today. We didn't need to rewrite cURL from scratch to do that. On top of this, a really interesting thing I think is that the maintainer of cURL didn't even have to learn Rust to do this work. These memory safe libraries come with C APIs, so a C programmer working on an existing project can just take the C wrappers around these memory safe libraries and put them right in there, and they don't even need to learn a new language. That's huge. We think a lot. We spend a lot of time thinking about how we can get plans that work for the maintainers, do this piece by piece. In a lot of cases, you don't even have to learn the new language.
We try to invest in those modular pieces. Our investment in the Rustls TLS library is going to pay off all over the place because there's a lot of projects where we can go and just remove OpenSSL and replace it with Rustls.
Michael Lynn: All with the goal of a safer internet. That's amazing. That's great work. So these are projects that are in play today. Do you want to talk a little bit about what the future looks like? Are there things on the roadmap?
Josh Aas: I'm sure there are other things that we'll do. We have a certain amount of organizational capacity and we're pretty focused on building up, especially the Divvi Up project these days, because that's a service. It's pretty resource intensive, so we're very focused on building that up. We're always looking for other opportunities to make a difference when it comes to privacy and security on the web, but we're probably not going to launch a whole new project every year or anything like that. I'm not really sure what'll be next. We're getting projects two and three off the ground here.
Michael Lynn: You know, I neglected to ask about the revenue model for ISRG. I mean, I know you are a nonprofit, but the Divvi Up service is that a fee for service, or is that something that's also offered for free or for donation?
Sarah Gran: The work that we're doing with COVID-19 exposure notification apps has been funded through grants and sponsorships. However, long term, the general availability version of Divvi Up will likely have a sliding scale model of payment where we need to cover the infrastructure costs for providing the service along with the staff members who will maintain it. Our hope is that we will have revenue generated from customers who can pay who will help support the important efforts on the internet that need this kind of protection and security and privacy, but don't have great financial resources. That will help make this more broadly available to more organizations with different types of capacities more quickly.
Josh Aas: Yeah. The fundamental difference here is that to give someone a TLS certificate, I'm not really sure what it costs, but it's pretty cheap. We do hundreds of millions of active certificates. We do millions of certificates per day, and we do that on a budget of $4 million a year. So when you can get that kind of efficiency out of the service, it makes sense that you can just do it for free. Divvi Up is potentially a lot more data coming from users, a lot more processing costs. It's just not possible to provide that to everyone for free as much as we would like to, so we are going to have to charge, like Sarah said.
Michael Lynn: It's been a great discussion. We'll include links in the show notes for all of the projects, as well as for ISRG. You can visit abetterinternet.org for more information there, but check the show notes for all of the details. Sarah, Josh, thanks so much for your time today.
Josh Aas: Thanks for having us.
Sarah Gran: Thank you.
Michael Lynn: Thanks so much to Sarah and Josh for joining us today, and thanks to you for listening. If you want to check out more information about ISRG, visit abetterinternet.org. Check out their projects, Let's Encrypt, Prossimo and Divvi Up. Have a great day.
DESCRIPTION
In this episode of the podcast, we talk with Josh Aas and Sarah Gran of the Internet Security Research Group (ISRG) to learn about their mission to secure the Internet through projects like Let's Encrypt, the automated digital certificate authority, and Prossimo, which focuses on reducing memory safety risks in popular open source projects.