Diels–Alder 2D Potential Energy Surfaces

Transcript

Hello folks, it's Corin here. Today I want to look at two-dimensional potential energy surfaces in Rowan and how we can explore Diels–Alder cycloadditions using a variety of low-cost computational methods. So what we have here is a 2D potential energy surface as shown for this reaction between cyclopentadiene and methyl vinyl ketone.

This is a classic Diels–Alder reaction. We have a diene here, a cyclic diene that's constrained to be highly reactive, right? So it's locked in the s-cis conformation. And up here, we have an alkene, so a dienophile, that has an electron withdrawing group attached to it in the form of this methylketone that makes it a better dienophile and thus more reactive. So this is one of these classic 4 + 2 cycloadditions that you see in intro organic chemistry.

And one of the things we can ask in this… is we actually have two forming bonds here. So we have C5 and C12, as highlighted in green. This is going to be one forming bond. And we have C2 and C13 here. That's another forming bond. And we might wonder, are these two bonds going to form exactly at the same time? Or, because this molecule is electronically asymmetric, the ketone is on one side and not the other, is there going to be a little bit of a time lag here? Will one bond form before the other one does?

And to answer this question, we can make a More O'Ferrell–Jencks plot. So a two-dimensional potential energy surface where we plot one forming bond on one axis and the other forming bond on the other axis. And this essentially gives us this 2D potential energy surface shown here. So lower energy points are shown in blue and higher potential energy points are shown in yellow. So we can see our reactants up here in the top right-hand corner. And as we sort of move down to the bottom left, we form our product, which looks like this. And this is about the ground state. So both C–C bonds are about 1.5 Å, which is typically what we expect for a single C–C bond.

If we look at this, obviously there's a region of blue up here, which is sort of the reactant well. And then there's a region of dark blue down here, which is the product well. And in between, we sort of have this ridge of higher energy structures. And this corresponds to the transition state region. So sort of the lowest energy way to get across here is what the transition state is going to look like.

And so if we look, these are all relative to the products. So obviously 42 is a very high value in kcals. But relative to 23 up here, that's only 19 or so, which is a much more reasonable barrier. And we can sort of move around and try to find where the lowest energy point is. We can watch these numbers change here. What's sort of the best guess for the transition state here? We could say maybe it's something somewhere in this region. So it's relatively central, but there's maybe a slight asymmetry, you know, where this bond here is forming a little bit ahead of this bond. That seems to be right.

Now, if we sort of look in the other corners, the high energy corners, we can see that very asynchronous structures where one bond forms all the way and the other bond hasn't formed at all, these are now very high in energy. And we can imagine that this is either going to be a diradical structure or a zwitterionic structure where you have, you know, most likely a negative up here and a positive allylic cation down here. This is a lot of charge separation. This is going to be very unhappy. And same here. So this is obviously less unhappy because we can stabilize the anion with sort of form an enolate here. But it's still, this is a reaction which would much rather be concerted and not stepwise.

So this is sort of what a two dimensional potential energy surface lets us see. We can really understand these questions of bond formation, synchronicity and mechanism very clearly in this format. Now this was done using the recent OMol25 models from Meta. So specifically, this is the model trained on the OMol25 dataset following the eSEN architecture. It's a small model and it uses conservative forces. And we can see that, all told, this took about two hours to run. Now that might seem like a long time, but if we look, we can see that, you know, there's actually a ton of points here.

So every single place on this represents an individual optimization and, you know, not only do we have to run an optimization for every single point, we actually run a minimum of four optimizations per point because Rowan uses the wavefront propagation algorithm to generate extra smooth scan surfaces. The exact details of that are slightly complex and you can look them up and read the paper on your own time. But essentially what this means is we try to remove hysteresis effects by running the scan in multiple directions and making sure that we don't have sort of cryptic degrees of freedom that we're not sampling properly.

So all told, this is almost 2,000 optimizations that goes into this PES. So getting it done in two hours isn't so bad. We can compare this potential energy surface generated at the OMol25 level of theory to some older methods that are perhaps less accurate.

So this is Egret-1. This is our recent model that we released, not on the OMol25 data set. So we did this work before Meta dropped the massive 100 million DFT structures to train models on. And we can see that this reflects some of the shortcomings of the data we have. So Egret-1 is just trained on neutral structures. And in the regions where things look neutral, so sort of down this central axis, I'd say things look very reasonable. So this structure looks great. Even the transition state structure looks very good. We can see, again, it's lower in energy if it's slightly asynchronous. So we're capturing that correctly.

But what we do see here is that there's these weirdly low energy structures off in the corners. In the Meta model here, these are very high in energy. This is very high in energy. Whereas with Egret-1, this is high in energy here, this sort of close contact. But over here, it's lower in energy. And I think this reflects the fact that Egret-1 actually isn't trained on any structures that have charge. And so Egret-1 doesn't know about carbocations or carbanions. It's not really handling charge correctly. And so these sort of very asynchronous mechanisms where you have charge buildup are not well described.

You know, this takes about the same amount of time. This actually took slightly longer. Wouldn't be surprised if some of these optimizations took a while to converge. And now on the other end of the spectrum, xTB, which is a semi-empirical method, again, the geometries look quite good. But now you can see the whole barrier height and the whole transition state region is washed out. It's very, very smooth. And the surface overall is very hard to understand. And there's not as clearly defined of a transition-state region. And this, I think, reflects the idea that, you know, while xTB often generates very good geometries, the energies are not particularly accurate. So you almost always want to re-score xTB methods for thermochemistry with a different method, just because the energies are not super accurate.

And so I think seeing these three potential energy surfaces right next to each other really illustrates what a step forward the OMol25-trained models are. You know, they're able to generate something that's super reasonable, that sort of reproduces the chemical intuition that we have, and does so at a fraction of the cost that a conventional DFT calculation would take. Just for comparison, this is what a DFT calculation looks like. I stopped this after two and a half hours, and we'd only made it through a handful of scan points here. So this is really hard to run 2,000 DFT optimizations. It takes a lot of compute, it takes a lot of time, and in the time it takes you to run just 10 or 20 optimizations you could have, you know, almost completed the entire potential energy surface and very likely at a higher accuracy than that method anyway.

So if we want to dig in just a little bit more, you know, what we can do is we can actually take a point from here, you know, in the transition-state region and resubmit it as a transition-state optimization. So I did this, this was the starting structure here. And then you can, you know, click run and it did a 44-step transition-state optimization to generate this structure.

And so this is now ostensibly a converged first-order saddle point on the potential energy surface. And if we go over here to frequencies, we can actually confirm that yes, there's a single imaginary frequency, just like we expect for a transition state. And it does match exactly what we want. So it shows two bonds being formed at the same time. So this is indeed a productive transition state. And we can see that this took under a minute, so only 46 seconds to find this transition state. So you don't even have to change tabs. It just runs right in front of you. It's very nice to see and it's very satisfying.

So this is, you know, this, has the geometry you want. We can see there's sort of like some, you know, endo interactions here. There's a nice pucker in the ring. It looks great. This looks beautiful. And if we want to go just one level deeper and studying the reactivity of the system, we can now resubmit this as an intrinsic reaction coordinate job, which I've done here. And so this takes the transition state and then follows the minimum-energy path backwards and forwards to sort of show what this reaction will look like along the minimum-energy path.

So if we follow it backwards, we can see the two molecules move apart and sort of float off into space. OK, it's about what we expect. And now as we watch them come together, we can see that in the actual post-transition state region, we can see first one bond form here. So this is the bond we'd expect to form first. We generate, you know, probably some delta minus charge here and some delta plus charge here. So maybe a little enolate character here and a little allyl cation character here. And then the other bond snaps into place just a second later. So there is a little bit of asynchronicity to this reaction as is expected, but the two bonds form in very close succession.

And we can even plot this here. So we can, you know, plot this bond length as a function of the reaction coordinate and see it smoothly descend. And then we can plot this bond length and see it also descend, but just slightly slower. And you know, if we want, we can also plot something like this. So we see here this double bond go from, you know, 1.33 Å to 1.55 Å, showing it elongate. We can plot this bond here that goes from, you know, sort of a single bond here to a double bond over the course of the reaction. You know, we can plot the change in the C-O bond length as we sort of pick up some enolate character around the transition state. You know, this will lengthen a little bit. And this bond will shorten a little bit and then lengthen a lot.

So there's all sorts of fun things we can do here watching sort of bonding and conjugation change over the course of the reaction. And of course, we can always resubmit these structures and run charge calculations also if we want to. I think my favorite thing to do with these IRC calculations, though, is just to play them through and really sort of watch the reaction happen, just to get a sense for what this looks like.

I think one of the most underrated aspects of computational chemistry is just sort of the way that it brings these otherwise abstract atomic reactivity concepts to life. So anyhow, thanks for watching. I hope you learned something from this video. And if you want, you can make a Rowan account for free and start playing around with these excellent OMol25-based models from Meta. They're very accurate and they've been good at almost everything we've tried them on.