Ep 136 Designing Sustainable Architectures with Snehal Bhatia
Snehal Bhatia: Hi, my name is Snehal Bhatia and I'm a solutions architect at MongoDB. Stay tuned as Shane and I discuss and explore how to build sustainable and environmentally friendly architectures.
Shane McAllister: This is the MongoDB Podcast. My name is Shane McAllister, and welcome to the show. We're grateful to have you join us for yet another episode. The environment and sustainability is not something that immediately springs to mind when we think about data. The cloud as a notion seems very abstracted from the reality that the cloud is merely somebody else's servers somewhere else, always on and always available. The year-on-year massive reduction in the cost of storage and connectivity, however, does have a real world consequence and we should be mindful regardless of how we use resources. In this episode, I talk to Snehal Bhatia, a solutions architect with MongoDB, about designing environmentally sustainable architectures and the extent to which the IT industry contributes to global emissions. Spoiler alert, it's the same as the aviation industry and growing. We discuss on-premise versus the cloud, how developers can optimize the architecture of a database so it can be designed with sustainability in mind through appropriate provisioning, data shaping, indexing, queries, and charting and lots more. So let's get started. Welcome to another MongoDB podcast. And today our guest is Snehal. Snehal is our solutions architect based in London. So Snehal, hello and welcome to the podcast.
Snehal Bhatia: Hey, Shane. Thank you. Happy to be here.
Shane McAllister: It's great to have you on board. I have seen your presentations on the subject we're going to talk about a few times, and I was really keen to get you on board. But before we dive into that, tell us a little bit about yourself: your career path to date, how you've ended up at MongoDB, and your day-to-day role in MongoDB.
Snehal Bhatia: Of course. So I started out my journey in computer science and generally the world of technology and software with my academics. So I did a bachelor's in computer science and engineering, followed by a master's in computer science. And in all those kind of courses, what I started focusing a lot on was data-oriented stuff. So I did a couple of machine learning projects at an academic research level, AI projects. As part of that, what I started discovering a lot was that there's all these cool tech things going on, but then there's a whole element of how it all affects humans and society as a whole. Since then, that's always been a subject that's intrigued me. So I did a little bit of research on the ethical considerations of machine learning algorithms, of maybe having a wide-scale IoT kind of connected system around our cities in a world where everything becomes a connected object, and the impacts on security and privacy. And lately I also came across certain other concerns that are out there as a whole, climate being one of them. So yeah, I'm really always intrigued and passionate about these projects and I really believe in making the world a better place, for the lack of a better phrase for that. So that's where my journey started in computing and software and also just in general how my interest in these topics came about. So I worked before joining MongoDB as an implementation consultant for insurance software. And I've been at MongoDB for almost two years now, working as a solutions architect. So what that means in my day to day is that I work with our customers, our users, to help them find the most optimized way to deploy and work with MongoDB and its various offerings in a way that suits them and their needs, because it is a very flexible and adaptable technology, so it does need a lot of expertise from that angle. So I work collaboratively with our customers to bring that to life.
Shane McAllister: Excellent. I know you have deep MongoDB expertise, but that's not why we're chatting today. I first came across the presentation you did at MongoDB World back in June '22 in New York, all about designing environmentally sustainable architectures. And it really piqued my interest because I think you had some incredible stats there at the very beginning about the IT industry itself and the percentage of global emissions we currently emit, that it's roughly the same as the aviation industry.
Snehal Bhatia: Yeah. And that's also a conservative estimate.
Shane McAllister: Wow. I did not know that we were on a par with that. That is something because I think that the aviation industry gets a bad rep for emissions probably because it's very easy to understand. Fuel goes in, it gets burned to travel somewhere. But we don't have the same understanding of emissions for IT. I think it was 2.8% of global emissions as IT? But it could grow, right?
Snehal Bhatia: Yeah. So it's estimated to go up to almost 23%, 24% by 2030. So 2030, why is that an important kind of year for us? It's because, based on a lot of conventions and conferences and UN-level acts, and the Paris Agreement, if you've heard about it, what they have defined in all those agreements is that by 2030, we need to make sure that our global temperatures don't go beyond a total of two degrees of rise as compared to what we had in the pre-industrial levels. So if that two degree kind of metric is breached, what that would mean is essentially we'll go beyond the tipping point of what our earth can handle. And coming back from there would be a much harder challenge. So we'll start seeing natural calamities and disasters of a scale that may then be out of our control. And even the two degrees is just the absolute maximum; the goal we're all trying to aim for here is to stay at the 1.5 degree level. So this all has to be done by 2030, and the breach projection right now is less than eight years, because average global temperature has been growing year after year to an extent where in eight years, if we keep going at the same rate, we are definitely bound to breach this. I mean, I'm sure you and our audience here are aware that the global temperature increase is caused by greenhouse gas emissions, which are in turn caused by everything we do. And the ICT industry, or the information and communications technology industry, is a big contributor to that.
Shane McAllister: It's not very far away, 2030. How are we doing? I think in the presentation that you gave on this, you had a link to the Climate Clock, an online website that was calculating where we were with this goal of the one and a half, two degrees. How are we doing? Are we on track? Are we off track?
Snehal Bhatia: So I think the studies showed that 19 of the hottest years ever have occurred since 2000.
Shane McAllister: Oh, wow. Okay.
Snehal Bhatia: The expected breach, if we keep going, can be anywhere between four years from now to seven to eight years from now if we go with the 1.5 degree target in mind. So really, we are today not in great shape. But on the positive side of things, and as a solutions architect I always like to think about solutions, it's also estimated that the ICT industry has the power of reducing global emissions by 45% by 2030. And that's not just by cutting down our own emissions and making sure that the way we develop and design technology is done in a good manner; it's also by automating operations, things that right now consume much more electricity and power in other industries, be that manufacturing, be that construction, be that even aviation. So the ICT industry in general is bound to keep making improvements to drive efficiencies in that space. But on the flip side of it, we can't be the reason for a contribution of 23% of global greenhouse emissions. So that's not going to give us the same... Because we're not balancing it out in that case. Which is why, while technology continues to be a big contributory factor to improvements in all other areas of life, if we think about it from a sustainability perspective, it's also important to be conscious that it has a reverse effect, because we consume more and more resources and the way we develop technology might as well just have a negative impact.
Shane McAllister: And it's amazing because obviously our reliance on technology, our use of technical products, technical services is growing all of the time. When it comes to the cloud and infrastructure, which we'll get to, I think for certain people, not necessarily ourselves in the tech industry, but for certain people they abstract away the notion of the cloud. They kind of go, "Everything is in the cloud. That's great. That's brilliant. I know it's all stored there." But that analogy has gone down so well that people fail to realize that at the end of the day, the cloud is still a physical item. There's a server. There's storage. There's computers involved as well. So when it comes to data and data architecture, how can we play our part in helping to reduce emissions, to reduce our greenhouse effect, et cetera? What are the best ways to go about that?
Snehal Bhatia: I mean, starting with the deployment aspect, you mentioned the cloud, but we also have around the world lots of organizations and big companies who are yet to even adopt any kind of cloud technologies because of lack of knowledge or fear, concern of security. And then they end up deploying in their own private data centers or on-premises solutions and so on. It has been shown that private data centers or self-hosted systems can have up to 84% more carbon emissions as compared to what public cloud providers such as AWS or Google Cloud or Azure can do for us. And the reason for that is very simple: these companies or these organizations are not in the business of building and maintaining data centers, right?
Shane McAllister: Mm-hmm.
Snehal Bhatia: They're in the business of providing the best-in-class service or technology products or software products, whichever industry they serve, right? So essentially that infrastructure, even though critical, it's not their main concern. So even if they would have the know-how and the skills and the knowledge to make it more and more efficient, that will probably never make their priority list. Right?
Shane McAllister: True.
Snehal Bhatia: So that's where these cloud providers such as AWS and Google Cloud and Azure come in: this is their core business. This is their core area of research and development and everything. So without going too much into everyone's statistics, like AWS, their studies show that they're 88% more energy efficient than on-premises infrastructure. They all have carbon neutrality pledges by 2030 or 2040. Google Cloud is already a hundred percent powered by renewable energy if we look at-
Shane McAllister: 100%. Wow.
Snehal Bhatia: Yeah, that's the statistics that they have published on their own websites. And then Azure is aiming for a hundred percent renewable energy by 2025 and also to be water positive and zero waste by 2030. Because it's not just the energy that goes behind these data centers, it's the material that is used in constructing the physical data centers. So the way they make their bricks, the way they make their cement, the way they construct this data center, the water stewardship behind it, all of these things are key factors in optimizing a data center. And these public cloud providers are investing heavily in making sure that they're always on top of their game.
Shane McAllister: Excellent. So I suppose it seems to be the number one improvement you can make is if you happen to have servers on premise or on site, if you can move to the cloud, then that's a drastic reduction in terms of your own emissions and a move towards sustainability. So MongoDB, gentle plug, allows you to store on all the major cloud providers, AWS, GCP, and Azure as well too. So by going that route, Snehal, we are leveraging their expertise and their knowledge in, I suppose, how they go about their own optimization, their own cooling, the fact that they use renewable energy, et cetera. So that seems like a really, really good first step for anybody to take, right?
Snehal Bhatia: Yeah. There's two things I like to say here. So the first aspect is definitely moving towards these managed infrastructures that make all their statistics public and available and knowing that it's definitely... It's kind of how... I don't know if I made this term up or I stole it from somewhere, but I like to think of it as sustainability of scale, much like economies of scale. So we think about economies of scale when we think about adopting managed services and so on. Because in that case, they're putting in all this work and they can make it available to us for cheaper because they can sell it essentially to millions of people. Similarly, when you think about a data platform or database or any other service for that matter, you can think about how the optimization of that product or platform is all then taken care of by the company that's developing it. So in our case, for example, with MongoDB Atlas, which is our fully managed cloud-based database as a service, we take care of all the technology that goes behind ensuring optimized backups and restores or ensuring optimized workload distribution and optimization. The database upgrades and patches that we apply automatically on Atlas instances, and the security and the compliance that we maintain for the whole, because we run pen testing and we run business continuity, disaster recovery testing and all those kind of things. We make developer tools available and so on. We do all this effort once and then we've got thousands and millions of Atlas deployments around the world, all of these people leveraging all this testing and development that we did in their own deployments. So they don't have to go and start developing tooling for scaling and for developing backups and whatnot. All they have to care about then, and also invest resources and time and developer effort and energy in, all those things, is just their application deployment.
So the thing that's truly differentiating for them is what they're doing. And then we're doing all of the rest of the work once, which can then be leveraged by millions of other organizations. So of course one aspect of looking at this is just reduction of effort and the fact that you don't have to worry about all this. But on the flip side, if you think about it, you are saving months and months of developer time, effort, energy, the resources they would consume and whatnot. The same logic goes behind adopting managed services such as Atlas or any other. It's just essentially sustainability of scale.
Shane McAllister: I love that. I love the idea that economies of scale, when thinking about a data center, we can understand that we can... It's tangible. There's a building there somewhere with associated cooling equipment, with associated hardware, et cetera as well. And that is very understandable and how they manage to leverage their experience there. But when we talk about applications, there are billions of applications globally, internal, external applications, et cetera, all needing roughly the same level of infrastructure. So what you're saying there is we take care of that, we can optimize for that, we can manage, as you say, the security and the compliance et cetera as well. So that in itself is also further reducing complexity, but also further reducing the need for replication.
Snehal Bhatia: Yeah. And the need for the resources as well that are needed to make all this efficient. Also, from our perspective, just like the data centers make their infrastructure as optimized as possible so that they can host more and more consumers on the same infrastructure, we want to make our data platform as optimized as possible so we can use the least amount of resources to service the most amount of people. So yeah, there's definitely all those aspects that add to the value of a managed service when you think about reducing the overall environmental impact, the negative impact that your application might have on the earth. I mean, I think we talked about how moving to the public cloud might be a first step if an organization isn't already there. But again, we can go into another level of granularity there, where each cloud provider has tens if not a hundred regions available, so you can deploy on many different regions. And not all these regions are as sustainable as the others.
Shane McAllister: Okay. Of course.
Snehal Bhatia: So for example, in AWS they have three carbon neutral regions right now. I think they're all in the US. So they're zero carbon regions, and Google Cloud, they have some regions in the Nordics, et cetera, where all the time it's coming from fully renewable energy. Same with other cloud providers as well. So some regions, just by virtue of how energy is generated in those geographical regions, can either be powered by fully renewable energy or, by virtue of how those centers are designed, can be carbon neutral and so on. So another level of optimization there, after just going to the cloud, is also being conscious about which regions you're deploying your applications on. Now, a lot of the guys-
Shane McAllister: Because we have a choice, right? So MongoDB, last I looked, has about 90-odd regions or locations and a choice between the three main cloud providers. Location is a concern obviously, but keeping your data close to your users or close to your customers is super important, right?
Snehal Bhatia: Yeah, absolutely. So we've got support. Atlas can be deployed in over 95 regions, I think, across the three major cloud providers that we're discussing right now. When you're selecting a region, the first concern would be like, "What is my query latency like? What are my geographical kind of data residency requirements? Am I following the laws or not?" But then there's a lot of cases where that might not be super important. Maybe your application can tolerate a little bit more latency, or maybe you want to have your main transactional nodes of your database hosted in your main region, but then you also have a very read-heavy application, and a lot of your users are just sending queries for reads. So that's where maybe you can decide to put read-only nodes that can be part of your main database cluster in regions that are more sustainable. Or similarly you might have analytical load running on your database, like maybe some kind of machine learning algorithms or some kind of business intelligence sort of workload. So it might be possible for you to put in an analytics node, which is again this concept of workload isolation that we have in Atlas, where you can have your transactional load and your analytical load running on the same database cluster. So you can choose to deploy your analytical nodes, for example, in a region that is carbon neutral because those workloads are not latency sensitive. It's okay if that reporting process runs for three hours, or returns results in two minutes instead of five seconds or something like that. So it's about being conscious about your different workloads and the fact that with Atlas we have this data platform that gives you the flexibility to distinguish between these workloads and deploy where you need. Even, for example, some applications might be heavy on analytics but low on transactions. So you can choose to deploy bigger analytical nodes, so provision more resources there and fewer to your main cluster.
So being smart and optimized with that as well.
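As a rough illustration of the workload isolation described above, the sketch below builds a connection string that routes reads to analytics nodes. It assumes the standard MongoDB `readPreference` and `readPreferenceTags` connection options and the `nodeType:ANALYTICS` replica set tag that Atlas assigns to analytics nodes; the host name, database name, and the `analytics_uri` helper are hypothetical, and credentials are deliberately left out.

```python
from urllib.parse import urlencode


def analytics_uri(host: str, database: str) -> str:
    """Build a MongoDB URI whose reads are routed to analytics nodes.

    Assumption: the cluster has analytics nodes tagged nodeType:ANALYTICS.
    """
    options = {
        # Read from a secondary rather than the primary (transactional) node.
        "readPreference": "secondary",
        # Restrict eligible secondaries to nodes carrying the analytics tag.
        "readPreferenceTags": "nodeType:ANALYTICS",
    }
    # Keep ':' unescaped so the tag stays readable in the query string.
    return f"mongodb+srv://{host}/{database}?{urlencode(options, safe=':')}"


# Hypothetical cluster host and database, for illustration only.
uri = analytics_uri("cluster0.example.mongodb.net", "reports")
print(uri)
```

A reporting job could use a URI like this while the main application keeps its default connection string, so the long-running queries land on nodes placed in the more sustainable region.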
Shane McAllister: So it's fascinating. I think we've discussed at length there about location and choice of hosting et cetera as well too. But once we have chosen our database and our hosting and our location, the architecture of that database can also be designed with sustainability in mind. There's lots of optimization you can do with how that database operates and works in itself. Correct?
Snehal Bhatia: Yeah. So that's absolutely right. I think there's this quote that says the greenest energy is the energy that we did not use in the first place. Right?
Shane McAllister: I like it, yes.
Snehal Bhatia: So definitely once we have made all those considerations, there's lots of things we can do from an architecture perspective. So right now it's been estimated that about 40% of the instances around the world are sized at least one size larger than needed.
Shane McAllister: Okay.
Snehal Bhatia: The reason for that is that not all workloads are predictable. Sometimes you need to provision for the peaks that you might have. So if you have an e-commerce application, you're running Black Friday sales days or something like that and you want to make sure that your application can handle that. You might just end up running year round on that extra provisioned infrastructure just because it's not easy to go up and down when you're managing it yourself. So that's where, again, a managed service such as MongoDB Atlas in this case comes in handy, because we have the feature of cluster auto-scaling, where your cluster, your database, can scale up or down based on your workload without causing any downtime, without causing any effect on your application. So of course it introduces all those operational efficiencies, but also then it reduces the amount of resources you're consuming to only just what you need. And this can be also taken one level further with our new... Well, not new anymore, the serverless instances.
Shane McAllister: It's been a while.
Snehal Bhatia: Yeah. The serverless instances that were released. So that just takes that elastic scaling to a whole new level as well.
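To make the peak-provisioning argument concrete, here is a small back-of-the-envelope sketch. The demand profile is entirely hypothetical, and the auto-scaled case is idealized: it assumes capacity tracks demand hour by hour, whereas real auto-scaling moves between discrete tiers with some lag.

```python
def capacity_hours(hourly_demand, fixed_size=None):
    """Sum the capacity provisioned over a period, in capacity-hours.

    With fixed_size, every hour is provisioned at that size (peak sizing).
    Without it, capacity follows demand exactly (idealized auto-scaling).
    """
    if fixed_size is not None:
        return fixed_size * len(hourly_demand)
    return sum(hourly_demand)


# Hypothetical day: 22 quiet hours at 2 units, then a 2-hour peak at 10 units.
demand = [2] * 22 + [10] * 2

peak_sized = capacity_hours(demand, fixed_size=max(demand))
auto_scaled = capacity_hours(demand)
saving = 1 - auto_scaled / peak_sized

print(f"peak-sized: {peak_sized}, auto-scaled: {auto_scaled}, saving: {saving:.0%}")
```

The exact saving depends entirely on how spiky the workload is; the point is only that sizing for the peak all day provisions far more than the workload uses.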
Shane McAllister: Yeah. And that brings in... Obviously, MongoDB had tiers and, as you say, are you using that tier effectively? Have you over-provisioned for that tier? And there are many cases where you need to over-provision, but perhaps, what was the figure? 40% was over-provisioned, I think. So you can go down a tier or indeed jump on our new serverless. Do we have any inkling of how many have moved to serverless in MongoDB, or are there any other concerns about organizations moving to serverless, which again is in my mind like the cloud, this notion that there's nothing there, but serverless is not serverless. There's still a server, there's still a computer, there's still storage and data behind that as well. But to get the true efficiencies there, you need more and more people to move to serverless, because that in itself is over capacity because you want people to move there, right?
Snehal Bhatia: Yeah. No, exactly. So in general, a typical server is said to consume around 15% of its total computing capacity while still consuming a lot of power, right? So if you have a server up and running, even if you're not consuming it completely, you're still consuming the amount of power that it takes to run it. And that's where, with serverless instances on the back end, what it allows us to do is host multiple of these workloads on the same server, hence optimizing it. But I think the catch here is that the true power of serverless technology in terms of making it more environmentally friendly is when it can be adopted at scale. So the way we can optimize it the most is by making sure that more and more workloads are running on the same amount of infrastructure that we have provisioned, right?
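The consolidation idea can be sketched with simple arithmetic. The 15% utilization figure is the one quoted above; the 60% target utilization for shared servers and the count of 100 workloads are assumptions chosen purely for illustration, and the model ignores memory, I/O, and isolation constraints.

```python
def servers_needed(workloads: int, util_pct: int, target_pct: int) -> int:
    """Servers required to host `workloads`, each using `util_pct` percent of
    one server, when shared servers are filled up to `target_pct` percent."""
    total_pct = workloads * util_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-total_pct // target_pct)


dedicated = 100  # one mostly idle server per workload
shared = servers_needed(workloads=100, util_pct=15, target_pct=60)

print(f"dedicated servers: {dedicated}, shared servers: {shared}")
```

The gain only materializes when enough workloads share the same pool, which is the adoption-at-scale caveat made in the conversation.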
Shane McAllister: Mm-hmm.
Snehal Bhatia: So it's easy to draw a parallel here. For example, with e-readers such as Kindle, it's said that given the amount of resources that go into manufacturing a Kindle and all the stuff that it takes, unless you read at least 60 books on it, you are not even breaking even on the environmental damage it causes because oftentimes-
Shane McAllister: Wow, I hadn't heard that. That's super interesting.
Snehal Bhatia: Yeah. I also found that out. I myself am a proud owner of an e-reader and I would think, "I'm not using paper books anymore, how nice."
Shane McAllister: Have you 60 books on it? Have you reached the target of 60 books?
Snehal Bhatia: I think I have, but I have a lot of friends who haven't or who are not as active readers. And so if you take that analogy back, unless-
Shane McAllister: It's a really good analogy.
Snehal Bhatia: ... enough people are using it, you're not achieving the full benefits of it. You're not making an impact like that. And around 60% of the organizations around the world are yet to adopt serverless technologies. And the reason for that is just educating the current workforce on how to use and work with serverless, or the fact that the existing integration testing frameworks and all those kind of frameworks that we have for making sure everything is up and running correctly haven't integrated serverless technologies in them, or the fact that people are still concerned about what kind of security control serverless can offer because it's not been adopted or tested out at scale yet. So there is a lot of hesitation on serverless adoption in general. So I think just getting up to speed with all the technical advancements and getting rid of these old notions of how it's not secure or optimized or whatnot would probably be a step we all need to take in general.
Shane McAllister: It's probably the same leap of faith that it took to go from on- premise and servers to cloud is probably needed to go to serverless then for the next step for many organizations. You mentioned testing there and I know that the clients and the developers that we deal with, they would have a development environment, a testing environment, and a production environment generally. They go in fits and spurts of sporadic use, I suppose. Certainly the testing and the development environment. Do we need a database on all the time? Can it be paused to save on energy cost and help the environment?
Snehal Bhatia: Yeah. I mean, if you think about it, most of the customers that I work with seem to not be using their non-production environments outside of work hours. So roughly that's almost 50% of the time that those environments are not put to use, but we still need to provision resources. We can't just spin them up and down. That's where MongoDB Atlas has a very unique capability to pause a database cluster on demand. So essentially at the click of a button, or you can even make a schedule for when you want to pause and unpause it by using Atlas triggers, which is part of our developer platform. It allows you to just write custom code and functionality to perform certain actions based on certain things. And about 44% of resources in general for any project are spent on non-production resources on average.
Shane McAllister: Nearly half, wow.
Snehal Bhatia: Nearly half. So if you can pause using functionality such as what Atlas provides, you will save on costs, because in that case all you're paying for is just what the disk is using, because the data is still stored somewhere. So it's really a fraction, a very, very small fraction of the total cost. So it's not only going to result in a huge cost reduction, but it's also going to reduce the emissions that you have. And that's a great way of optimizing your overall consumption.
Shane McAllister: Another aspect of optimizing I've heard you speak about, which was really interesting to me: I know within MongoDB here we talk about how data that is accessed together should be stored together. How do you optimize for data access, queries, indexes, et cetera? Could you give us some idea of ways to think about that to make it, again, more efficient, more performant, but ultimately more environmentally friendly as well?
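The scheduling logic behind pausing non-production clusters can be sketched as follows. This is only the decision step: the working hours are assumptions, the helper names are hypothetical, and nothing here calls Atlas. The only product detail relied on is that the Atlas Administration API accepts a boolean `paused` field when a cluster is modified; the request itself (from an Atlas trigger or any scheduler) is left out.

```python
from datetime import datetime

# Assumed working hours in local time; adjust to the team's real schedule.
WORK_START_HOUR = 8
WORK_END_HOUR = 18


def should_be_paused(now: datetime) -> bool:
    """Return True when a non-production cluster is expected to be idle."""
    is_weekend = now.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend
    outside_hours = not (WORK_START_HOUR <= now.hour < WORK_END_HOUR)
    return is_weekend or outside_hours


def pause_request_body(now: datetime) -> dict:
    """Build the body a scheduled job would send when modifying the cluster."""
    return {"paused": should_be_paused(now)}


print(pause_request_body(datetime(2024, 1, 3, 10, 0)))  # a Wednesday morning
print(pause_request_body(datetime(2024, 1, 3, 22, 0)))  # the same day, late evening
```

Running a check like this on a schedule keeps development and test clusters paused overnight and at weekends without anyone having to remember to click the button.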
Snehal Bhatia: Yeah. So I know we've spent a lot of our time today discussing the architectural aspect, but this is another aspect that might fall maybe a little bit towards the developer side as well, which is actually just defining the data model. I think what MongoDB allows you to do, as you very rightly said, is it lets you model your data based on how your application accesses it or how your application needs to use it. So if you compare it to the format that we're most familiar with, which is tables and rows and columns and relational structure, that structure imposes a certain way of storing data, which means that your application then has to write workarounds to retrieve it in the way that it needs to use it, which then leads to lots of intensive operations such as database joins, if you're familiar with them. It leads to less optimized queries that might end up... You might need to parallelize them more just because of how they are natively not as performant and so on. So leveraging a flexible data model such as the document model of MongoDB can really help you model your data in a way that your most frequently run queries are optimized for, and they can access it with much less resources and without the need for expensive operations such as joins that consume a lot of resources. You'll probably read about this even in the... Because a big part of sustainable architecture out there lies on the web frontend side of things, or the web side of things. And there's the principles of sustainable web design that are out there. If you just Google it, they'll pop up, and they all talk about how having a more optimized data model is needed to reduce the overall carbon emissions as well.
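The difference between joining at read time and storing data the way it is read can be shown with plain Python structures. This is a toy sketch with made-up customer and order data; it does not use a database and is only meant to show the shape of the two access patterns.

```python
# Relational-style layout: two "tables" that must be combined on every read.
customers = [{"id": 1, "name": "Ada"}]
orders = [
    {"customer_id": 1, "item": "keyboard"},
    {"customer_id": 1, "item": "monitor"},
]


def orders_via_join(customer_id: int) -> dict:
    """Scan both tables and stitch the result together at read time."""
    customer = next(c for c in customers if c["id"] == customer_id)
    items = [o["item"] for o in orders if o["customer_id"] == customer_id]
    return {"name": customer["name"], "items": items}


# Document-style layout: data that is read together is stored together.
customer_docs = {1: {"name": "Ada", "items": ["keyboard", "monitor"]}}


def orders_via_document(customer_id: int) -> dict:
    """A single lookup returns everything the application needs."""
    return customer_docs[customer_id]


print(orders_via_join(1))
print(orders_via_document(1))
```

Both functions return the same answer, but the first does work proportional to the size of both tables on every call, while the second is one keyed lookup. Embedding is a trade-off rather than a rule: it suits data that is usually read together and does not grow without bound.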
Shane McAllister: Okay. You touched on it there briefly. If I'm not sure, have I optimized my database and if I'm not sure, maybe I'm dealing with something that's been legacy for a while or maybe I'm new to the role and I've come in to a company and I'm looking at that, how can I monitor that? What sort of tools does MongoDB have to inform my decision as to whether I have optimized my data and my architecture appropriately? Obviously, for all of the things we care about, speed and resilience, but obviously environmentally as well too, what information could I get?
Snehal Bhatia: So I think that the key there is then that it should be easy for you to retrieve that information, which means that your database technology should expose all of those metrics to you. So with Atlas for example, the data platform exposes over a hundred key performance indicators. A lot of these are in the form of visual graphics and charts and most of these can be exported out. You can set alerts to it so you can track things like storage and utilization and data transfer, and compute, and so on. So the key here is to then understand how exactly do you want to analyze these things? So not just independently. Maybe when you're looking at performance and you're looking at general operations, you'll probably look at, " Okay, this is what my brand performance looks like. This is what my disc utilization looks like." But when you start thinking about it in terms of sustainability, it's important to start thinking about these in relation to one another. So the way to think about it would be that almost sustainability would become another one of your non- functional metrics or non- functional SLAs. So for example, you might have availability defined and you might have response time and quality of results and costs and so on. So sustainability then becomes another non- functional metric for you that will then be used that we can then use all these metrics to study. So for example, you might have chosen a carbon neutral cloud region for your deployment. Right? Let's say it's all based in the US but if all of your users are based somewhere in Australia and they're all sending, it's a very read heavy like a Facebook application that has millions of users and they're all concentrated on another side of the world, they're going to be sending so many frequent queries to this region in the US and all of this is going to incur a lot of network transfer resources. So not only is that network transfer cost, but also the resources needed for all this network transfer. 
So you need to balance these metrics against one another, which is where it really becomes important to start defining these as your... Especially when you're doing project planning and things like that, it needs to become as important as performance and cost and stuff like that.
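The trade-off Snehal describes here, a greener region far from the users versus a less green region close to them, can be sketched as a rough comparison. This is only an illustration: every number, the region names, and the idea of a per-gigabyte network energy figure that grows with distance are assumptions for the sake of the example, not measurements or anything Atlas reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionOption:
    """One candidate deployment region. All figures are hypothetical."""
    name: str
    grid_gco2_per_kwh: float    # assumed carbon intensity of the region's power
    compute_kwh_per_day: float  # assumed energy used by the database cluster
    transfer_gb_per_day: float  # data moved between users and the region
    transfer_kwh_per_gb: float  # assumed network energy, higher for distant users


def daily_emissions_g(option: RegionOption, network_gco2_per_kwh: float = 400.0) -> float:
    """Estimate grams of CO2 per day as compute plus network transfer."""
    compute = option.compute_kwh_per_day * option.grid_gco2_per_kwh
    network = (option.transfer_gb_per_day
               * option.transfer_kwh_per_gb
               * network_gco2_per_kwh)
    return compute + network


def greenest(options: list[RegionOption]) -> RegionOption:
    """Pick the option with the lowest combined estimate."""
    return min(options, key=daily_emissions_g)


# A low-carbon region far from the users versus a closer, less green one.
far_green = RegionOption("far-low-carbon", 50.0, 100.0, 2000.0, 0.06)
near_users = RegionOption("near-users", 300.0, 100.0, 2000.0, 0.01)
best = greenest([far_green, near_users])
```

With these made-up figures the region near the users comes out ahead, because the network transfer outweighs the cleaner power. Different inputs would flip the result, which is why the metrics need to be read together rather than one at a time.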
Shane McAllister: Yeah. I love that idea. I think that a sustainability metric would be of interest to different people within an organization, but as you say, at the moment we're looking at performance metrics and it needs to live alongside those, I think, ultimately. In your talk, I remember, I think you mentioned eco mode for applications. I was really impressed with that concept because we are so used to always on, always available, instant response, kind of instant activation, et cetera. I think going back to the earlier notion of the cloud, we don't think about it as a physical entity. And in your talk you mentioned the green icons you see associated with airfare and travel. I think if we were given that option when we're signing up for some online service, we could say how frequently we want this and how important it is to us. And maybe there's an eco tier as well too, whereby it would suit both the requirements of the company itself to sustain their eco credentials, but also me as a conscious consumer saying, "Well look, I'm quite happy to pay for this service and I might pay a little bit more for the more eco-friendly version or the more sustainable version." Because the market needs to push companies in this direction as well too, right?
Snehal Bhatia: Yeah, exactly. So it might just also be a case of you saying, "I'm happy with the eco mode. I'm happy that it'll take me two seconds instead of one second for loading my data." So you might as a user be okay with the lower level of performance knowing that you are also then contributing towards a better cause. As an eco-friendly consumer of applications, you might choose to make that choice. So applications it almost-
Shane McAllister: It'd be great to have that choice, wouldn't it? I know my kids run their phones on low power mode because for them running out of battery is a key problem. They don't need their emails retrieved automatically every 30 seconds or so, or they don't need necessarily the notifications to come through except for some of their favorite apps which constantly annoy them. But they turn on low power mode because in their mind the most important parameter of their mobile device is the longevity of the battery. I think if we take that and flip it, I think it'd be really interesting to see, especially some of the larger consumer or B2C companies, put forward those types of plans. If you have, excuse me, a subscription you could opt for something that is more eco-friendly in terms of what it does and how it operates et cetera as well too. So going back to the question of how I know about my database: say I'm running a web application or a site or something like that, and maybe it's not on MongoDB yet, how do I know if that is eco-friendly? Are there online tools? Where could I go to learn more? What else could I do?
Snehal Bhatia: Yeah, there are definitely online tools available. A lot of them make their methodology available as well. So there's this website called websitecarbon.com. There's also this website called digitalbeacon.co and they essentially tell you how much carbon or energy a certain website is generating and what that can compare to in your day to day life and so on. So we definitely have that stuff, but what we also have then is the Green Web Foundation and climateaction.tech, or principles.green, which are the green web programming principles. These are all organizations and associations working together to find more and more efficient ways of making the tech world more green, if I can say that.
Shane McAllister: Excellent. I'll follow up with you and I'll get links to those websites and we'll add them into the show notes of this podcast. So if anybody wants to check those out for themselves as well too, we'll also add a link back to that presentation that you did at World so people can go through that. I know that you continue to do that presentation and expand on it all the time. It's important maybe that we take the best parts of MongoDB World and we travel it around the world, for lack of a better word. So we have MongoDB .local events and they are coming to, hopefully, a city near you. We have done Frankfurt recently, but we have Dallas and London and others coming up as well too. So go to mongodb.com/events to see what's happening there as well too. For me, this has been super eye opening. I've learned a ton and I will try to play that conscious part in understanding. At the beginning you said the move to the cloud is probably the biggest thing anyone who's still on premise can do with their servers. Is there anything else that we should be thinking about? And I suppose like any movement, if we start asking the correct questions of our organization, of our business, of our infrastructure, then hopefully we will create that shift towards something that is more sustainable and more environmentally friendly. I know inside here at MongoDB that's certainly something that we're trying to look at all the time.
Snehal Bhatia: Absolutely, yeah. I think that one thing that really stands out for me in this case is that usually, in life outside of the tech world, when we try to make more sustainable choices or greener choices, sometimes it can be much more expensive or much more inconvenient or it might not be the easiest thing to do. But in the world of tech it's essentially very interrelated to everything we're trying to do anyway. We're anyway trying to reduce the amount of resources we use because that optimizes for cost. We're anyway trying to make our queries more efficient and our data model more efficient because that improves performance and that in turn reduces the amount of resources it consumes and so on. So what that means is that making the more sustainable choice in this case is actually the easier and the better option. It's not something that's extremely hard. It's not something that's going to have massive negative consequences or be much more difficult. So making the sustainable choices is anyway what we all are striving towards in the world of tech. We are really in a unique position of advantage to make the most of this and make sure that we do the best for the environment. And there are lots of studies out there showing that the businesses that are likely to do well in the future are what they're calling the twin transformers, and these businesses are going to find themselves at the intersection of digital technologies and sustainability. And these studies are saying that businesses that adopt more sustainable kinds of approaches will be the ones that actually succeed in the long run. We do have a lot of incentives for doing this even outside of just making the right choices. But given the fact that we're running against such a strong timeline, I would say that sustainability by itself is enough to justify all these efforts.
Shane McAllister: Yeah, definitely. I get you there. It's not an either or choice. It's not a binary choice. Essentially most of the time, particularly when it comes to data architecture, sustainability goes hand in hand with performance and responsiveness. As you said in your example earlier, there is no point querying a database that's in the US if the majority of your customers are in India, for example. I really love that idea that it is a win-win for everybody and that sustainability should be on top of your thinking with availability, with response times, with performance as well too. And it just needs to be a metric, right? That's it. It should be in your dashboards as another metric of how sustainable is your architecture.
Snehal Bhatia: Absolutely.
Shane McAllister: Excellent. Well, this has been fascinating, fascinating. We could probably go on a lot longer on this too, but I know we'd certainly get another episode out of you, no doubt in the months to come. Have you anything further to add or any other pointers for our listeners who'd like to learn more about this or how they go about looking at sustainable architectures?
Snehal Bhatia: Yeah, absolutely. I mean, we talked about a lot of things from an architecture perspective. I think there are other things you can think about leveraging. If you're an Atlas user, we have something called Online Archive, which is essentially a cheaper but also more optimized storage solution because it can store tons of data, tons is not a very scientific word, but lots of data, using fewer resources. So if you have data that you don't access very frequently, but you need to keep it, you can consider automatically tiering it to an online archive kind of solution. You can start thinking about optimizing, using the right indexes, optimizing for your query patterns. Also, at the end of the day, think about how your software is impacting the end devices, because if you are running a lot of spiky workloads and there are a lot of peaks in your workloads, sometimes it can't be avoided, but sometimes it can. If you can spread that out, it would ultimately lead to less impact on the physical hardware that's running it. So there's really a lot you can consider there. So the bits we discussed today are just the tip of the iceberg, if I may say that.
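The point about spreading out spiky workloads can be illustrated with a small sketch. It assumes the work is deferrable (batch jobs, for instance, rather than live user requests) and that a per-slot cap is acceptable; the function name `spread_load` and the numbers are hypothetical, chosen only to show the idea.

```python
def spread_load(jobs_per_slot: list[int], cap: int) -> list[int]:
    """Smooth a spiky schedule by running at most `cap` jobs per time slot.

    Jobs above the cap are carried over to later slots, so the total
    amount of work is unchanged but the peak is lower.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")

    smoothed = []
    backlog = 0
    for jobs in jobs_per_slot:
        pending = backlog + jobs
        run_now = min(pending, cap)
        smoothed.append(run_now)
        backlog = pending - run_now

    # Drain whatever is still queued after the last original slot.
    while backlog > 0:
        run_now = min(backlog, cap)
        smoothed.append(run_now)
        backlog -= run_now

    return smoothed


spiky = [1, 9, 2, 0]            # hypothetical jobs arriving per hour
smooth = spread_load(spiky, 4)  # same total work, peak reduced from 9 to 4
```

The trade-off is latency: deferred jobs finish later, which is the same kind of choice as the "eco mode" discussed earlier in the conversation.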
Shane McAllister: I can well imagine that, and I certainly think there's a lot more to come in this. I know where I'm based in Ireland, we have a lot of data centers. All of the major providers are here. But particularly with the increase that we're seeing now in energy costs and the energy crisis that we have, there's a lot of concern that the data centers here are using, I think it was 28, 29% of the country's generation capacity. And that's a concern too. I think like anything in the environmental space, it needs the amalgamation of consumer concern, pressure, technology, and also then some leading lights to lead the way. So hopefully we will get to the point where we see that kind of eco symbol beside the digital services that you're buying and not just your flights. But Snehal, this has been absolutely eye opening and superb. I will certainly try and get you back in the months to come to elaborate on some of this. For anybody who wants to know more, we will have links in our show notes. Please check those out. But Snehal, this has been superb. Thank you very much for joining us.
Snehal Bhatia: Thank you for having me. This was very fun.
Shane McAllister: Wow. What a fascinating conversation with Snehal and plenty to think about regardless of how much or how little data you store. Of course, with MongoDB Atlas serverless, which can meet the needs of any workload pattern and you only pay for your consumption, you can get a small head start in having a sustainable architecture. Many thanks to Snehal for joining me. We are always looking for guests on the MongoDB Podcast. People with interesting stories to tell in the world of software development and data. So if you feel you can contribute, do get in touch at podcast@mongodb.com and we'd love to hear from you. Speaking of interacting, during November and December, we have .local events and MongoDB Days in many cities globally. Too many to mention, but if interested, you can find out more at mongodb.com/events. And the best part, all of the events are free to attend. So do check out that link to see if there's one near you. And remember to check out the show notes for further links to any of the sites mentioned in my chat with Snehal. Thanks again for listening. We really do appreciate it. If you did enjoy this episode and by any chance have not done so already, please do leave us a rating and possibly even a review on whatever podcast platform you use. It really does help us a lot and we very much appreciate it. So from me, Shane McAllister, and the rest of the podcast team, until next time. Do take care and thanks for listening.
In this episode, Shane McAllister talks with Snehal Bhatia, a Solutions Architect with MongoDB, about Designing Environmentally Sustainable Architectures and the extent to which the IT industry contributes to global emissions. We discuss on premise vs the cloud, how developers can optimise the architecture of a database so it can be designed with sustainability in mind through appropriate provisioning, data shaping, indexing, queries and sharding.
Episode Links -
Connect with Snehal on LinkedIn -
Shane on Twitter @shaneymac
MongoDB Events - are all listed at mongodb.com/events
Want to join the podcast? We're always looking for podcast guests, so if interested, please email us at firstname.lastname@example.org