Studying Azide–Alkyne Cycloaddition With the Freezing-String Method

Transcript

Hi folks, it's Corin here from Rowan. And today I'm going to show how Rowan's new double-ended transition-state-search workflow can be used to rapidly study reactivity. The reaction I want to study today is the azide–alkyne cycle addition. And specifically, this is a copper-free variant of it that uses cyclooctynes to enable this reaction to happen so fast that it doesn't even need the copper catalyst.

The reason this is useful is that both of these functional groups, azides and alkynes, are not reactive on their own in the body. But when put together, they do react. And so this makes this reaction "bioorthogonal." Basically, you can put one functional group on a lipid, you can put another functional group in the cytosol or on a protein, and then you can have them react when they get near each other. So this becomes useful for assembling structures in situ in biological contexts. Carolyn Bertozzi and other chemical biologists have done a lot of interesting work with this chemistry.

And the reason this works is that you use a strained sort of super-reactive alkyne, like some of these structures, that just sort of gets things done. You know, it boosts it a little bit more. So you have this extra, you know, sort of bending strain on the alkyne that makes it more reactive, but not so unstable that it reacts with just anything. It's still selective for the azide.

In particular, I wanted to look at these structures here. You know, this reactivity was studied a while ago by Houk and Bertozzi in this paper from 2012, and I thought, you know, this will be a good model system to show how the double-ended transition-state-search workflow in Rowan can work. You know, in particular, these sorts of cycloadditions can be a little bit difficult to find with just a single scan, because you actually want to scan along two bond distances at once. So this is kind of a natural choice for showing something that would be hard with existing tools that's now a little bit easier.

So if we go to Rowan, we start out on our projects page and actually make a new project just for this. So we could share this separately with people or manage controls or organize our data more efficiently. And now we sort of have this blank canvas. So what we're going to do here is we're going to go to the transition-state-search overview page. We're going to see that we have a blank space for both our reactant and our product files. So here I'm going to say draw 2D and we'll just start drawing our product out.

So we have a ring here we're going to want to make this a triple bond. We'll put another ring in. We will want to have one carbon, one of another carbon, this will connect here. Oh, that's not right. So we need to change this now. There we go. We'll want to make this a double bond. You'll want to make this an N with a methyl. And then actually I realized this is the reactant, but we want to be drawing the product here, so in the product box. So we'll say, this will be it. We're all over the place here. So we'll want to make this a double bond. I want to make this a single bond. Make this an N, this an N, this an N. Then we'll have that. Okay, great.

So we'll save that. Does that look like the product we want? It is. Fantastic. Okay, so we'll notice that there's a little bit of an issue here. So we see this little yellow alert box and we say this structure is a 2D structure, but this workflow expects 3D structures. Click the button with the cube icon to convert the structure. So that's right, we've drawn this in 2D, but we need a 3D structure to do a transition-state search. So we'll just click convert to 3D and everything works pretty much as we expect. Awesome.

So now what we're going to do is we're going to start with our product and we're gonna go up to reactants. We're going to put this in and we'll say "Paste XYZ. "So we'll just duplicate the structure for the reactant. But now in this case, this is obviously not the reactant, this is the product again. So we need to edit it to unform all these bonds we've made.

So first I'll click here, we'll say default remove bond, click on this, default remove bond. And now what we'll do is just, we'll just push these away from each other. So we'll say, get away from me. And so now what we've done is we've broken this bond. We've just created a pre-reactive complex here. It doesn't need to be minimized. We don't need to do anything fancy here. We just need to look at this and say, OK, yep, this looks like we're ready to go. OK, and so now we'll say, BARAC plus methyl azide. We'll stick with the OMol25-eSEN-conserving-small NNP for this study, which is quite a good NNP for this. We will turn on "Optimize Inputs," but we'll turn off "Optimize transition state," and we'll run that transition-state optimization separately to make it little easier to show in the video. And it'll say, submit double-ended TS search.

All right, so this is going to take a second because we're going to allocate some cloud GPUs for this. But what this is essentially going to do is first minimize the input and output structures and then run a freezing-string-method calculation. And so what the freezing string method is going to do is it's going to iteratively add points bridging between the reactant and product structures and minimize them orthogonal to the reaction of the reaction, to the direction of the reaction coordinate, which will try to find the best barrier connecting these structures and then the highest point along that pathway will look a lot like a transition state. So we'll do this and we're just gonna have to wait a couple minutes while this runs.

So I've just jumped forward a couple of minutes just so we didn't have to wait for GPUs to be allocated. And the calculation is now running. So we can see as we have a starting structure here which has now been minimized. So the molecules are a little closer together and we have an ending structure over here, which is obviously the triazole that we wanted to form. And what we do now is that we can see the points being added to the start and end of the string. So as we move along this way, the molecule gets a little closer together. As we move back here, the bonds start to break. And there's obviously still this big gap right between them. And somewhere in that gap is going to be our transition state. So that's what we're looking for.

These calculations are running pretty fast. So every new point runs a new optimization. But the fun thing about neural network potentials is instead of waiting a couple hours for each optimization, this only takes 10 or 15 seconds. So we can watch the string grow in real time. Okay, so now we can see, yep, still coming together here. If we look at the relative energy down here as we move along the string with our arrow keys, we can see that it is going up, but it's not going up a ton yet. And down here we can see that the, notice they scan backwards from the product, unless it's not technically a scan, ah we can see that the energy also is rising. So just waiting for that transition state here.

Alrighty, you know, this is starting to look like something that sort of resembles the transition state. This isn't a bad guess for the transition state yet, although, yeah, you know, that might actually be it. So we can look and we can see, yes, this calculation is actually completely finished. So we have, yeah, our forward string and our backward string. They met in the middle and we can look and say, okay, the highest energy point is actually right here. So it's this 107 kcal/mol above the products, which is crazy. That's so much. That really illustrates why this is, you know, a click reaction. There's so much potential energy sitting in these steps.

Okay, so now we can take this and we can say, let's resubmit this as a transition state guess. And so we'll say resubmit current structure. We'll go to calculation. We'll say optimize TS and frequencies. And we'll say "TS from FSM." So this is the highest energy point on the FSM. We're going to submit this to a transition state optimization. So this starts running right away because this is a slightly lighter weight calculation and doesn't take quite as much compute. Yeah, and we can see this is again going to be using OMol-eSEN-conserving-small. It shows where this is resubmitted from and it will start to show optimization steps as this proceeds. If we look back at this FSM calculation, you know, the whole thing took only three, almost four minutes. So three minutes, 52 seconds, which is pretty fast.

So this is fast enough that you can run, you know, a good number of these. If we wanted to do a scan, you know, for different substitution on the azide or different electronic properties in the arenes or even, you know, deleting the arenes altogether or adding different functional groups here, it would be straightforward to do this for a lot of different cases. And so that's quite nice. I think, you know, this is truly like a high throughput ready automation tool and not something that's so expensive that you can only run one at a time and sort of, you know, count your shots, so to speak.

Okay, so let's see what progress we made here. Oh, okay, 44 steps. And so what we can see is that we were pretty close to the transition state. So if we graph, you know, what the C–N distance here is, yeah, you know, we've moved up a little bit here. Yeah, we've moved it down a little bit. So there's a little wiggling back and forth that's been done, but this is, think, a reasonable guess for the transition state. You can also do these sorts of geometric analyses here. So for instance, we want to see how this bond distance changes. ah It changes roughly as we expect, but that can be nice to see. If we want to see how this bond angle changes, that's sometimes kind of fun.

OK, so now this is done. So we have a single imaginary frequency. If we look at this, this does actually correspond to the movement we expect for the cycle addition transition state. You know, this bond is lengthening a little bit. The angle is decreasing here. These atoms are moving towards each other. So that seems good. 300 cm-1 is a pretty good imaginary frequency. It's not one of these like 20 cm-1 imaginary frequencies that usually corresponds to some methyl rotation or something. And yeah, if we look, this calculation took only 90 seconds. So that's also pretty good.

So now the last step I want to do in this video is to resubmit this current structure as an intrinsic-reaction-coordinate calculation. And we can think of this as essentially the freezing-string method but backwards. So here we know our transition state, but what we want to find is how it goes down on both sides. So say "IRC from TS." We're not going to re-optimize the TS because we literally just found it. And we'll keep the default parameters here. And we're just going to submit this.

So the FSM is a little bit messier than an IRC. So with the FSM, you don't know the transition state. You're sort of growing the string optimistically from the reactants and products trying to find a saddle point. Here, we can say we know the saddle point and we want to see what it looks like on both sides. And so this lets us actually look at how the reaction happens and will let us confirm that the transition state we found is "correct," that it actually will form the separated reactants on one side and will form the product on the other side.

And again, this is all automatically done on Rowan via GPU. We pick a GPU that fits the size of the system and the neural network potential you've chosen so that you have enough VRAM, it runs quickly, et cetera, et cetera, et cetera. So this should be running quite fast and we should see some steps coming in within the next minute or so.

Okay, so we're just waiting around here a little bit for the IRC to start running.I believe what's happening right now is that there's a Hessian calculation ongoing. Much like with a transition state optimization, an IRC needs an initial second-derivative matrix to give accurate results. That's just sort of how it works. It's a second-order method. And so what this means is that we actually go through and build the matrix of mixed nuclear second derivatives which, again, doesn't take days like it does for DFT, but can take a little bit longer than just a routine optimization. And so we just have to be a little patient here.

That is, you know, something I've been spoiled with through Rowan is that I kind of expect all my calculations to run immediately. And if I have to change tabs or wait around, I get a little miffed, um, as opposed to in a previous life where I had to only do everything with DFT, I was just used to submitting things and checking back the next day. It's crazy how things change in just a few years.

Okay. So now we're seeing the optimization steps pour in. And this, this is quite fast, right? I suppose they're IRC steps and not optimization steps. And if we look, this direction just so happens to be the direction towards reactants and we can see Yeah, if we start from here It does actually go towards reactants. So that's good that that suggests that this is actually the process we think it is. And then once we finish this side we'll go into the other side and we will hopefully—fingers crossed—see that this forms products.I didn't run any of these calculations to check that they actually worked before the video. Thisis a one and done kind of tutorial. So it's entirely possible that this will fail, which would be funny.

Things seem to be pretty good here. We can look at how this angle is changing. We can see, yep, it's linearizing. So it starts at about 150 º in the TS and it's going towards 180 º. We can look at how this bond length is changing and we can see, yep, it's shrinking. So it starts out as sort of a partial double bond character here and now becomes more of a triple bond. So going from 1.23 to 1.20 Å. I don't know what else we can look at. I mean, we can look at all sorts of things. We can look at how this bond length is changing.

That's kind of interesting. I don't quite know how to think about that. I'm saying here does shrink. I mean, it's a small change, but it's not a negligible change. Okay. Yeah. And now here's the other side of our RSC. So if we start our TS and we go this way, we see that the bond is actually forming. And so while this isn't quite finished yet, what this RSC is showing us is that yes, this transition state that we're starting from is correct. This is indeed representing the correct azide–alkyne cycloaddition and that we've correctly found the transition state for this process.

So if we wanted to do further analysis on the system, there's a lot of things we could do. So one of them is just look at the difference in energy between this transition state and the transition state where the methyl group is over here. In other words, try to predict the regioselectivity of this process. You know, will we preferentially form one regioisomer over another? I actually don't know if we will. That'd be interesting to find. And we could compare to experimental data to see how well this NNP or any other NNP does.

Another thing we could do here is we can compare different substitution here, different substitution here to look at the transition state energies. And if we want to actually get a ∆G‡ prediction out of this, what we'd want to do is separately optimize each of the reactant structures and then compare their separated Gibbs free energies to the Gibbs free energy of the transition state. And then using the Eyring–Polyani equation, we could get a rate estimate from these calculations.

OK, IRC's just finished. Again, in total, this took five minutes. And so if we were feeling particularly high throughput, we could probably skip this step and just look at this and say, yeah, this looks like a good enough process. But the IRC does let us actually confirm that this is the correct reaction. And now if we flip the x-axis, we can see this happen all the way through. And there it is, the 3+2 cycloaddition between methyl azide and the strained cyclooctane.