The GitHub API opens up an exciting world of possibilities for automating workflows, integrating with GitHub, managing projects, and analyzing data. As Python developers, we can take full advantage of the API to boost our productivity and create useful tools.
In this step-by-step guide, you'll learn how to use the GitHub API from Python.
Why Use the GitHub API with Python?
Before jumping into the code, let's look at why using the GitHub API with Python is so powerful:
- Automate workflows – Eliminate repetitive tasks by writing scripts to create issues, open and merge PRs, release binaries, etc.
- Enhance productivity – Integrate custom tools into your dev environment to improve workflows.
- Manage projects – Programmatically manage issues, labels, and milestones across repositories.
- Analyze data – Mine interesting metrics and insights from the over 96 million repos.
- Integrate and extend GitHub – Create custom web apps, visualizations, CLI tools, bots, and more!
The API opens up many creative ways to boost productivity and build great developer tools and experiences.
Overview of the GitHub API
The GitHub API provides RESTful endpoints to access GitHub data and services. You can:
- Manage repositories, gists, issues, pull requests
- Interact with Git data – commits, branches, tags
- Retrieve user profiles, organizations, teams
- Search code, issues, repositories, users
- Access metadata, issues, PRs, files, commits
- Analyze community trends, project forks
And much more!
The API exchanges data as JSON and authenticates requests with tokens (personal access tokens, OAuth tokens, or GitHub App tokens). All requests must be made over HTTPS.
To use the API, you simply:
- Create a GitHub account
- Generate a personal access token for authentication
- Make API requests and handle responses
Now let's see this in action with Python examples!
Making GitHub API Requests
Python's requests library makes it easy to interact with web APIs. Let's fetch some GitHub user data:
import requests
username = "defunkt"
response = requests.get(f"https://api.github.com/users/{username}")
print(response.json())
This prints information like:
{
  "login": "defunkt",
  "id": 2,
  "node_id": "MDQ6VXNlcjI=",
  "avatar_url": "https://avatars.githubusercontent.com/u/2?v=4",
  "gravatar_id": "",
  "url": "https://api.github.com/users/defunkt",
  "html_url": "https://github.com/defunkt",
  //...
}
We can access any unauthenticated API route this way. To access private data, we need to pass an authentication token.
Creating a GitHub Personal Access Token
To generate a token:
- Go to Settings > Developer settings > Personal access tokens
- Click Generate new token
- Give it a description like "My Python Script"
- Select the scopes/permissions you want
- Click Generate token
Be sure to copy the token – you cannot retrieve it again later!
Common scopes include:
- repo – Access private repositories
- admin:org – Manage organizations
- notifications – Access notifications
- user – Read/write access to profile info
Let's use our token to create a new repo:
import requests
token = "ghp_123abcMyToken"
data = {"name": "My New Repo"}
response = requests.post(
"https://api.github.com/user/repos",
json=data,
headers={"Authorization": f"token {token}"}
)
print(response.status_code) # 201 = Success!
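The token authenticates the request, so GitHub creates the repository under the token owner's account.
You can also use GitHub Apps, which have fine-grained permissions and authenticate with short-lived installation tokens instead of a long-lived user token. An app must be installed by a user or organization to gain access.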
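Hardcoding a token, as in the snippet above, is fine for a quick experiment but risky in shared code. Here is a minimal sketch of one alternative, assuming the token is stored in an environment variable. The variable name GITHUB_TOKEN and the helper build_headers are choices made for this example, not part of the API. GitHub accepts both the "token" and "Bearer" authorization schemes; this sketch uses the latter.

```python
import os

API_VERSION = "2022-11-28"  # REST API version header value; adjust as needed


def build_headers(token=None):
    """Return request headers for the GitHub REST API.

    If no token is passed, fall back to the GITHUB_TOKEN environment
    variable (an assumed name). With no token at all, the headers are
    only suitable for unauthenticated requests.
    """
    token = token or os.environ.get("GITHUB_TOKEN")
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": API_VERSION,
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

The returned dictionary can be passed as headers=build_headers() to requests.get or requests.post.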
Working with GitHub Repositories
A major part of the API involves managing repositories. Let‘s go through some common repository tasks.
Get a Repository
To get a repository's metadata:
response = requests.get("https://api.github.com/repos/pandas-dev/pandas")
repo_info = response.json()
print(repo_info['description'])
# Powerful data structures for data analysis, time series, statistics
The response includes fields such as the description, star count, and fork count, along with URLs for related resources like contributors, languages, releases, and commits.
List Repositories
To list repositories for a user or organization:
repos_url = "https://api.github.com/users/octocat/repos"
repos = requests.get(repos_url).json()
for repo in repos:
    print(repo['name'])  # Prints names of each repo
Create a Repository
We can also create new repositories:
data = {
"name": "My New Repo",
"description": "This is my cool new repo",
"private": False
}
response = requests.post(
    "https://api.github.com/user/repos",
    json=data,
    headers={"Authorization": f"token {token}"}  # f-string so the token is substituted
)
Delete a Repository
To delete a repository:
requests.delete('https://api.github.com/repos/octocat/Hello-World',
                headers={"Authorization": f"token {token}"})
This gives you full control over programmatically managing your repositories.
Note: All API requests must be made using HTTPS for security.
Working with Issues in Repositories
The Issues API allows managing issues and pull requests. You can:
- List/create/edit/close/reopen issues
- Lock conversations, merge PRs
- Submit and edit comments
- Add labels, assignees, milestones
For example, to get issues from a repository:
response = requests.get('https://api.github.com/repos/octocat/hello-world/issues')
issues = response.json()
for issue in issues:
    print(issue['title'])  # Prints each issue title
This allows you to integrate issue management into external tools and workflows.
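One detail worth knowing: the REST API treats every pull request as an issue, so the list returned by the issues endpoint can include pull requests, which carry a pull_request key. A small sketch of filtering them out follows. The helper name only_issues and the sample data are made up for this example.

```python
def only_issues(items):
    """Drop pull requests from a list returned by the issues endpoint.

    Items that are pull requests include a "pull_request" key;
    plain issues do not.
    """
    return [item for item in items if "pull_request" not in item]


# Hand-written sample shaped like the API response (not real data)
sample = [
    {"number": 1, "title": "Bug in parser"},
    {"number": 2, "title": "Fix parser", "pull_request": {"url": "..."}},
]
for issue in only_issues(sample):
    print(issue["number"], issue["title"])
```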
Working with Git Data
The Git Data API provides endpoints to interact with Git repositories directly. You can:
- Manage branches and tags
- Read/write blob data
- Retrieve commits, references, trees
- Compare commits, references, files
For example, to get commits from a repo:
commits_url = "https://api.github.com/repos/pandas-dev/pandas/commits"
commits = requests.get(commits_url).json()
for commit in commits:
    print(commit['sha'])  # Print commit SHAs
    print(commit['commit']['message'])  # Print messages
This provides complete access to programmatically manage Git repositories.
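Commit messages can span many lines, so a listing is often easier to read with just the subject line and a shortened SHA. Here is a sketch, assuming each item has the sha and commit.message fields used above. The helper summarize_commit and the sample commit are hypothetical.

```python
def summarize_commit(commit, sha_length=7):
    """Return 'shortsha subject' for one item of the commits list."""
    short_sha = commit["sha"][:sha_length]
    message = commit["commit"]["message"]
    # The subject is the first line of the commit message
    subject = message.splitlines()[0] if message else ""
    return f"{short_sha} {subject}"


# Hand-written sample shaped like the API response (not real data)
sample = {
    "sha": "0123456789abcdef0123456789abcdef01234567",
    "commit": {"message": "Fix typo in docs\n\nLonger explanation here."},
}
print(summarize_commit(sample))  # 0123456 Fix typo in docs
```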
Searching for Repositories and Code
GitHub's search API allows querying for almost anything across the over 96 million public repositories.
For example, to find Python projects related to data science:
import requests
query = "language:python data science in:readme"
response = requests.get("https://api.github.com/search/repositories",
                        params={"q": query})
results = response.json()['items']
for result in results:
    print(result['name'])  # Prints names of matching repos
The search query syntax supports Boolean operators, filters, context selection, and more to craft targeted searches.
Some examples:
- org:facebook language:python stars:>5000 – Python repos in the Facebook org with over 5k stars
- filename:requirements.txt django – Repos with requirements.txt containing Django
- user:defunkt location:san francisco – Find defunkt's repos if location is SF
The Search API opens up many creative ways to mine interesting data sets and insights from across GitHub‘s open data.
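Queries like the ones above are plain strings of keywords and qualifier:value pairs, so they are easy to assemble in code. Below is a minimal sketch. The helper build_search_query is hypothetical, and it does not check that a qualifier is one the search endpoint actually supports.

```python
def build_search_query(keywords="", **qualifiers):
    """Join free-text keywords and qualifier:value pairs into one query.

    Qualifier names are passed through unchanged, so the caller is
    responsible for using ones the search endpoint understands.
    """
    parts = [keywords.strip()] if keywords.strip() else []
    for name, value in qualifiers.items():
        text = str(value)
        # Values containing spaces must be wrapped in quotes
        if " " in text:
            text = f'"{text}"'
        parts.append(f"{name}:{text}")
    return " ".join(parts)


query = build_search_query("data science", language="python", stars=">5000")
print(query)  # data science language:python stars:>5000
```

The result can be passed as params={"q": query}, and requests takes care of URL encoding. Because in is a Python keyword, that qualifier has to be supplied as build_search_query("data science", **{"in": "readme"}).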
Using GitHub's GraphQL API
In addition to the REST API, GitHub provides a GraphQL API for more flexible queries.
GraphQL allows you to specify precisely the data you want in nested JSON structures. You can query multiple linked entities in one request.
For example, here we query a user's profile data as well as their repository names:
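import requests
token = "ghp_123abcMyToken"  # placeholder; the GraphQL API rejects unauthenticated requests
query = """
query {
  user(login: "defunkt") {
    name
    repositories(first: 10) {
      nodes {
        name
      }
    }
  }
}
"""
response = requests.post(
    "https://api.github.com/graphql",
    json={"query": query},
    headers={"Authorization": f"Bearer {token}"},
)
print(response.json())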
This allows shaping the exact response you need. The GraphQL Explorer helps build queries interactively.
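Rather than editing the query string for every user, GraphQL lets you declare variables and send their values alongside the query. The sketch below builds such a request body and unpacks a response of the matching shape. The helper names and the sample response are illustrative assumptions, not real data.

```python
QUERY = """
query($login: String!, $count: Int!) {
  user(login: $login) {
    name
    repositories(first: $count) {
      nodes {
        name
      }
    }
  }
}
"""


def build_payload(login, count=10):
    """Build the JSON body for a POST to the GraphQL endpoint."""
    return {"query": QUERY, "variables": {"login": login, "count": count}}


def repo_names(result):
    """Pull repository names out of a decoded GraphQL response."""
    if result.get("errors"):
        # GraphQL reports most failures in the body, often with HTTP 200
        raise RuntimeError(result["errors"])
    nodes = result["data"]["user"]["repositories"]["nodes"]
    return [node["name"] for node in nodes]


# Hand-written response in the shape this query returns (not real data)
fake_result = {
    "data": {
        "user": {
            "name": "Example",
            "repositories": {"nodes": [{"name": "a"}, {"name": "b"}]},
        }
    }
}
print(repo_names(fake_result))  # ['a', 'b']
```

The payload would be sent as json=build_payload("defunkt") together with an Authorization header.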
Integrating the GitHub API into Apps
Now that you know the basics, let's look at building applications with the GitHub API.
OAuth App Authorization
For web apps, use GitHub OAuth for authorization instead of hardcoded tokens. This allows users to revoke access.
- Register a new OAuth App
- Use the Client ID and Secret for authorization
- Redirect users to request GitHub access
Now your app can make API calls on behalf of users.
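The redirect and token exchange in the steps above can be sketched as follows. This assumes GitHub's standard web application flow. The client ID, callback URL, and helper names are placeholders invented for this example, and a real app would also need a web route to receive the callback.

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://github.com/login/oauth/authorize"
TOKEN_URL = "https://github.com/login/oauth/access_token"


def build_authorize_url(client_id, redirect_uri, scopes, state):
    """Return the URL to send the user's browser to for approval."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # scopes are space-delimited
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def build_token_request(client_id, client_secret, code):
    """Return (url, data, headers) for exchanging the callback code for a token."""
    data = {"client_id": client_id, "client_secret": client_secret, "code": code}
    headers = {"Accept": "application/json"}  # ask for a JSON reply
    return TOKEN_URL, data, headers


state = secrets.token_urlsafe(16)  # random value to verify on the callback
print(build_authorize_url("my-client-id", "https://example.com/callback", ["repo"], state))
```

After the user approves, GitHub redirects back with code and state query parameters. The app should confirm that state matches, then send the token request with requests.post(url, data=data, headers=headers) and read access_token from the JSON reply.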
Making Authenticated Requests
Once authorized, make calls with the access token:
access_token = "abc123xxddff" # OAuth token
response = requests.get(
"https://api.github.com/user/repos",
headers={"Authorization": f"token {access_token}"}
)
print(response.json())  # Print the repos the user can access, including private ones
This lets you access private data based on the user‘s permissions.
Rate Limiting
The GitHub API has rate limits on requests. Monitor your app's status:
response = requests.get("https://api.github.com/users/octocat")
print(response.headers['X-RateLimit-Limit'])  # 60 per hour unauthenticated, 5000 with a token
print(response.headers['X-RateLimit-Remaining'])  # Requests left in the current window
Spread requests over time and cache data to avoid limits.
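One way to act on these headers is to pause until the limit resets once no requests remain. Here is a minimal sketch, assuming the X-RateLimit-Reset header (a UTC epoch timestamp in seconds) is present alongside the two headers above. The helper seconds_until_reset is hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before sending the next request.

    Returns 0 while requests remain in the current window. Otherwise
    returns the number of seconds until the time given in
    X-RateLimit-Reset, which is expressed in UTC epoch seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0, reset_at - now)
```

A caller could run time.sleep(seconds_until_reset(response.headers)) before the next request. Passing response.headers is preferable to a plain dict because requests matches header names case-insensitively.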
Handling Errors Gracefully
Always check status codes and handle errors properly:
response = requests.get('https://api.github.com/invalid/url')
if response.ok:
    print(response.json())  # Only treat 2xx responses as success
elif response.status_code == 404:
    print("Resource not found!")
elif response.status_code == 403:
    print("You do not have access!")
else:
    print(f"An error occurred: {response.status_code}")
This ensures your app remains stable in production.
By following API best practices, you can build robust integrations and tools for developers.
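Status codes alone can be ambiguous: a 403, for instance, may mean either missing permissions or an exhausted rate limit. The sketch below shows one way to map common failures to readable reasons. The helper explain_failure and its messages are invented for this example and cover only a handful of cases.

```python
def explain_failure(status_code, headers):
    """Map a failed response to a short, human-readable reason."""
    rate_limited = headers.get("X-RateLimit-Remaining") == "0"
    if status_code == 429 or (status_code == 403 and rate_limited):
        return "Rate limit exceeded"
    if status_code == 401:
        return "Bad or missing credentials"
    if status_code == 403:
        return "Forbidden"
    if status_code == 404:
        # GitHub also answers 404 for private resources the caller cannot see
        return "Not found (or not visible with this token)"
    if status_code == 422:
        return "Validation failed"
    if status_code >= 500:
        return "GitHub server error"
    return f"Unexpected status {status_code}"


print(explain_failure(403, {"X-RateLimit-Remaining": "0"}))  # Rate limit exceeded
```

In practice this would be called as explain_failure(response.status_code, response.headers) whenever response.ok is false.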
Building a GitHub Dashboard App
Let's tie together what we learned by building a web app for viewing your GitHub profile and repos using Flask:
# app.py
from flask import Flask, render_template
from github import Github  # PyGithub library
app = Flask(__name__)

@app.route("/")
def dashboard():
    # Use access token for API requests
    github = Github("access_token123xxdd")
    # Fetch user profile info
    user = github.get_user()
    # Fetch list of repos
    repos = user.get_repos()
    # Pass info to template
    return render_template("dashboard.html", user=user, repos=repos)

if __name__ == "__main__":
    app.run(debug=True)
We use PyGithub to simplify some API interactions. The homepage renders the dashboard.html template, which Flask looks for in a templates/ folder next to app.py:
<!-- dashboard.html -->
<h3>GitHub Dashboard for {{user.name}}</h3>
<img src="{{user.avatar_url}}" style="width:64px">
<h4>Your Repositories</h4>
<ul>
{% for repo in repos %}
<li>{{repo.name}}</li>
{% endfor %}
</ul>
This shows how you can build an app to display GitHub data for the user who owns the access token!
The possibilities are endless for integrating the API into your own apps and tools.
Best Practices when using the GitHub API
Here are some best practices to ensure your applications using the GitHub API are performant, secure, and robust:
- Authentication – Use tokens or OAuth, avoid sending raw username/passwords.
- HTTPS – Always use HTTPS endpoints to secure data.
- Rate Limiting – Spread out requests and cache data to avoid limits.
- Pagination – Use page parameters to iterate through result sets.
- Error Handling – Handle 4xx and 5xx errors gracefully.
- Testing – Thoroughly test API calls, and mock responses so tests do not depend on the live API.
- Documentation – Read docs closely, they provide code samples for each endpoint.
Following API best practices prevents avoidable mistakes and ensures reliable apps.
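Pagination is the practice from this list most often skipped in short examples, since list endpoints return only one page per request. The sketch below follows the next link that GitHub sends in the Link response header, which requests exposes as response.links. The function iter_items is hypothetical, and it assumes an endpoint that returns a JSON array, such as the repository and issue lists shown earlier.

```python
def iter_items(session, url, per_page=100):
    """Yield every item from a paginated list endpoint.

    `session` is expected to behave like requests.Session: get() returns
    an object with raise_for_status(), json(), and a `links` dict parsed
    from the Link response header.
    """
    params = {"per_page": per_page}  # 100 is the largest page size
    while url:
        response = session.get(url, params=params, timeout=10)
        response.raise_for_status()
        yield from response.json()
        # The "next" link already carries the query string
        url = response.links.get("next", {}).get("url")
        params = None
```

With requests installed, a call might look like iter_items(requests.Session(), "https://api.github.com/users/octocat/repos"). Taking the session as a parameter also makes the function easy to test with a stand-in object.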
Other GitHub API Features to Explore
We've only scratched the surface of what's possible with the GitHub API. Here are some other cool features to check out:
- GitHub Actions API – Automate workflows by triggering Actions with the API
- GitHub Pages – Programmatically manage Pages sites
- Gists – Manage code snippets, configs, and templates
- Organizations – Manage org teams, members, and permissions
- Git Database – Directly access Git object data like blobs and trees
- GitHub Marketplace API – Manage apps listed in GitHub Marketplace
- GitHub Discussions API – Build community forums and Q&A integrations
The API capabilities expand as GitHub adds new features, so keep an eye out for new endpoints.
Compare GitHub's API to Alternatives
For developers working with other platforms, how does GitHub's API compare to competitors like GitLab, Bitbucket, Azure DevOps, etc.?
Overall GitHub‘s API capabilities stand apart in terms of:
- Adoption – By far the largest user base and community
- Documentation – Extremely thorough docs with examples
- REST + GraphQL – Flexibility of both REST and GraphQL endpoints
- Search Capabilities – Powerful indexed search across all public data
- Ecosystem – Huge ecosystem of apps, tools, and integrations
- Code Analysis – Code scanning, linting, and quality analysis capabilities
GitHub clearly leads in API functionality thanks to its scale and years of development. Other providers like GitLab and Bitbucket are expanding API capabilities to compete. But for now GitHub remains the most fully featured API for programmatically interacting with Git repositories.
Next Steps and Resources
I hope this guide provided a comprehensive overview of how to use the GitHub API with Python!
Here are some next steps and resources for further learning:
- Take a deeper dive into the docs: Developer Docs
- Watch API videos: GitHub Learning Lab
- Discover apps built with the API: GitHub Marketplace
- Learn more about OAuth: OAuth App Docs
- Check out libraries like PyGithub and urllib3 to simplify API usage
- Consider mocking responses to test API interaction
- Explore associated tools like GitHub CLI and GitHub Actions
The GitHub API opens up an entire world of possibilities for building developer tools, automating workflows, managing projects, and analyzing data. I hope you feel inspired to create something valuable to the community!
Happy coding!