Lecture 1
During the lecture, please copy the following snippets of code into Spyder, and follow along with the presentation.
You can download the lecture notes here: MSE 593 Lab 1 Lecture Notes
Part 1: Plotly Figure Example
See the Band Gap vs Density File here: BandGapvsDensity.html
Part 2: Materials Project API Query
Some relevant links:
Materials Project API: https://materialsproject.org/docs/api
Python Data Structures: https://docs.python.org/3/tutorial/datastructures.html
Materials Project features available via API: https://github.com/materialsproject/mapidoc/tree/master/materials
MongoDB Query Language: https://docs.mongodb.com/manual/reference/operator/query/
from pymatgen.ext.matproj import MPRester
MPR = MPRester("2d5wyVmhDCpPMAkq")
criteria = {'elements':{"$in":["Li", "Na", "K"], "$all": ["O"]}, #All compounds contain O, and must have Li or Na or K
'nelements':3,
'icsd_ids': {'$gte': 0},
'e_above_hull': {'$lte': 0.01},
'anonymous_formula': {"A": 1, "B": 1, "C": 3},
"band_gap": {"$gt": 1}
}
# The properties and the criteria use MaterialsProject features
# You can see what is queryable from the MP API documentation:
# https://github.com/materialsproject/mapidoc/tree/master/materials
# The criteria uses mongodb query language. See here
# for more details: https://docs.mongodb.com/manual/reference/operator/query/
props = ['structure', "material_id",'pretty_formula','e_above_hull',"band_gap","band_structure"]
entries = MPR.query(criteria=criteria, properties=props)
print(len(entries))
for e in entries:
print(e['pretty_formula'])
print(e['band_gap'])
print(e)
break
To save your query locally:
import os
import json
from pymatgen.ext.matproj import MPRester
directory = os.path.join(os.path.dirname(__file__))
MPR = MPRester("2d5wyVmhDCpPMAkq")
def get_entries():
cache = os.path.join(directory, 'example_query')
if os.path.exists(cache):
print("Loading from cache.")
with open(cache, 'r') as f:
return json.load(f)
else:
print("Reading from db.")
criteria = {'icsd_ids': {'$gte': 0},'e_above_hull': {'$gte': 0},
"nelements":3,'anonymous_formula': {"A": 1, "B": 1, "C": 3}}
props = ["material_id",'pretty_formula','e_above_hull',"band_gap","band_structure"]
entries = MPR.query(criteria=criteria, properties=props)
with open(cache, 'w') as f:
json.dump(entries, f)
return entries
entries=get_entries()
print(len(entries))
Part 2: Making the Band Gap vs Density figure
import os
import json
from pymatgen.ext.matproj import MPRester
from pymatgen.core.structure import Structure
import plotly.express as px
from matplotlib import pyplot as plt
from pymatgen.core.composition import Composition
from pymatgen.core.periodic_table import Element
current_dir = os.path.join(os.path.dirname(__file__))
MPR = MPRester("2d5wyVmhDCpPMAkq")
def get_bs_entries():
## Many queries are very large, so this python
# method either queries the MP and saves it in the 'cache' file,
# or if the cache file exists, it loads it directly from the cache.
cache = os.path.join(current_dir, 'ternox_band_gap_data')
if os.path.exists(cache):
print("Loading from cache.")
with open(cache, 'r') as f:
return json.load(f)
else:
print("Reading from db.")
from pymatgen.ext.matproj import MPRester
MPR = MPRester("2d5wyVmhDCpPMAkq")
criteria = {'has_bandstructure': {'$eq': True},'elements':{'$all': ['O']}, 'nelements':3, 'e_above_hull':{'$lte':0.05}}
# The criteria uses mongodb query language. See here for more details: https://docs.mongodb.com/manual/reference/operator/query/
props = ['structure', "material_id",'pretty_formula','e_above_hull',"warnings","band_gap","band_structure"]
#The properties and the criteria use MaterialsProject features
#You can see what is queryable from the MP API documentation: https://github.com/materialsproject/mapidoc/tree/master/materials
entries = MPR.query(criteria=criteria, properties=props)
print(len(entries))
#Save files are prepared in a 'JSON' file.
#Some MP objects are not JSONable, and so they must be turned into a dictionary before they can be saved.
new_entries=[]
for e in entries:
X=e
X['structure']=X['structure'].as_dict()
new_entries.append(X)
with open(cache, 'w') as f:
json.dump(new_entries, f)
return entries
entries=get_bs_entries()
print(len(entries))
import pandas as pd
D={'atomic_volume':[],'band_gap':[],'mpid':[],'formula':[],'name':[], 'diff_electroneg':[]}
# Pandas is a generalized Python data storage platform, sort of like Excel.
# What we are doing here is creating 'columns' for this dataframe,
# And then we are generating the data to put into this column.
# Some features can be saved directly, such as band_gap. However, other
# ones we have to code manually, for example, atomic volume.
for e in entries:
comp=Composition(e['pretty_formula'])
#If we are doing atomic volume, H-containing oxides have spuriously low volume since they often form OH anions.
if Element("H") in comp.elements: continue
s=Structure.from_dict(e['structure'])
atomic_volume=s.volume/len(s)
D['atomic_volume'].append(atomic_volume)
D['band_gap'].append(e['band_gap'])
D['mpid'].append(e['material_id'])
D['formula'].append(e['pretty_formula'])
A=sorted(comp.elements, key=lambda el: el.X)[0] # This sorts the elements by electronegativity and takes the first element
B=sorted(comp.elements, key=lambda el: el.X)[1] # Open pymatgen.core.periodic_table to see more Elemental Features
D['diff_electroneg'].append(B.X-A.X)
name=e['material_id']+': '+e['pretty_formula']
D['name'].append(name)
df = pd.DataFrame(D)
#Plotly is an interactive data platform so that we can hover over datapoints and explore further.
import plotly.express as px
import plotly
fig=px.scatter(df,x="atomic_volume",y="band_gap",color='diff_electroneg',hover_name='name')
plotly.offline.plot(fig, filename='BandGapvsDensity.html')
Relevant Links:
Pandas in 10 minutes: https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html
Object-oriented programming: Bank Account Example
Pymatgen Composition object: https://pymatgen.org/pymatgen.core.composition.html
Figure Galleries:
Plotly Interactive Figures: https://plot.ly/python/
Matplotlib: https://matplotlib.org/gallery.html
Seaborn: https://seaborn.pydata.org/examples/index.html
Bokeh: https://docs.bokeh.org/en/latest/docs/gallery.html
Lecture 2
Part 1: Convex Hulls for Stability Analysis
from pymatgen import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter
#This initializes the REST adaptor. Put your own API key in.
MPR = MPRester("2d5wyVmhDCpPMAkq")
#Entries are the basic unit for thermodynamic and other analyses in pymatgen.
#This gets all entries belonging to the Ca-O system.
entries = MPR.get_entries_in_chemsys(['Ca', 'O'])
#With entries, you can do many sophisticated analyses,
#like creating phase diagrams.
pd = PhaseDiagram(entries)
plotter = PDPlotter(pd)
plotter.show()
To read about all the functionalities in the MP API Rest Interface:
https://pymatgen.org/pymatgen.ext.matproj.html
To read about Object Oriented Programming in the Bank Account:
To read about the Composition Object
https://pymatgen.org/pymatgen.core.composition.html
https://github.com/materialsproject/pymatgen/blob/master/pymatgen/core/composition.py
To read about computed entries, and the phase diagram:
https://pymatgen.org/pymatgen.entries.computed_entries.html
https://pymatgen.org/pymatgen.analysis.phase_diagram.html
How to learn Pymatgen?
Work on a real problem. Don’t plan to learn the whole thing at the outset. Instead, search for what you need to use, and learn pymatgen one function at a time.
Look for what you need in the documentation: https://pymatgen.org/pymatgen.html#subpackages. It will give you some idea on how to get started.
Read the source code for the relevant method if you need more details: https://github.com/materialsproject/pymatgen
Slides
Lecture 3
Developing new descriptors using Pymatgen
Get Symmetrically Equivalent Miller Indices code
Paper that today’s lecture is based off of
Lecture 4
Two tips on navigating Pymatgen code:
Tip 1: The developer of Pymatgen, Professor Shyue Ping Ong (UCSD), has a website with some sample notebooks. These give you tips on how to run a few common Pymatgen functions. It is not complete, but these notebooks are very instructive.
http://matgenb.materialsvirtuallab.org/
Tip 2: Another place to read sample code is in Python UnitTests. Whenever Pymatgen is updated with new code, it runs a series of tests to ensure that Pymatgen remains self-consistent. For example, if someone updated the Composition object with a bug, it could possibly break the Phase Diagram code somewhere else. UnitTests are designed to check that the code base is not broken.
However, UnitTests also offer an advantage that they provide sample code on how to execute various pymatgen scripts, and also, it tells you what the ‘answer’ is supposed to be, from the ‘AssertEquals’ methods.
For example, here is a snippet from the test_xrd.py code
import unittest
from pymatgen.core.lattice import Lattice
from pymatgen.core.structure import Structure
from pymatgen.analysis.diffraction.xrd import XRDCalculator
from pymatgen.util.testing import PymatgenTest
import matplotlib as mpl
class XRDCalculatorTest(PymatgenTest):
def test_get_pattern(self):
s = self.get_structure("CsCl")
c = XRDCalculator()
xrd = c.get_pattern(s, two_theta_range=(0, 90))
self.assertTrue(xrd.to_json()) # Test MSONAble property
# Check the first two peaks
self.assertAlmostEqual(xrd.x[0], 21.107738329639844)
self.assertAlmostEqual(xrd.y[0], 36.483184003748946)
self.assertEqual(xrd.hkls[0], [{'hkl': (1, 0, 0), 'multiplicity': 6}])
self.assertAlmostEqual(xrd.d_hkls[0], 4.2089999999999996)
self.assertAlmostEqual(xrd.x[1], 30.024695921112777)
self.assertAlmostEqual(xrd.y[1], 100)
self.assertEqual(xrd.hkls[1], [{"hkl": (1, 1, 0), "multiplicity": 12}])
self.assertAlmostEqual(xrd.d_hkls[1], 2.976212442014178)
Note that PymatgenTest is an imported object for this XRDCalculatorTest class. The PymatgenTest object can be found here: https://github.com/materialsproject/pymatgen/blob/master/pymatgen/util/testing.py. You can see that it is pulling the CsCl file from the Test_Files directory
The code then shows you how to run a calculated XRD pattern. Next, the self.assertEqual tags tell you what the output data structure should look like.
Every folder directory on Pymatgen has a ‘tests’ folder. The UnitTests can be found in there. Usually, these UnitTests run through a full Python object, along with the various functions and properties. By reading the tests, you can get a strong feel for what each Pymatgen object can do.
Here is an example of the tests directory for the analysis package. https://github.com/materialsproject/pymatgen/tree/master/pymatgen/analysis/tests
Other APIs
Online documentation is not always great. I have tried to give you a set of skills for navigating APIs and other online databases. These skills should be generally transferrable.
Try navigating the NREL high-throughput experimental library:
GUI for https://htem.nrel.gov/
Github repository with some example API scripts for NREL HTEM Database
Youtube Tutorial Video for using HTEM API
In the upcoming weeks, I will give you a few links to other APIS:
MPDS Pauling Files Database (Still working on getting a subscription a UMich)
Tutorial for the MPDS platform
AFlowLib (Another high-throughput materials database):
Other general free APIs