It's just a JSON file, so you can use it in any environment. The data is sourced from GitHub's Linguist project, which defines all 500+ programming languages known to GitHub. It is updated via script and released as a new package version.
    pip install programming-languages

    import programming_languages

    py_lang_data = programming_languages['Python']
    print(py_lang_data['extensions']) # => ['.cgi', '.fcgi', '.gyp', ...]

Note: Most type checkers will falsely warn that programming_languages is not subscriptable, because they cannot analyze runtime behavior (the module is replaced with a dictionary at runtime for cleaner, direct access). You can safely suppress such warnings using # type: ignore.
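If scattering # type: ignore comments through your code feels noisy, one alternative is to cast the imported object once and use the typed name from then on. The sketch below is only an illustration: as_language_dict and LanguageData are hypothetical names, not part of the package, and the value shape (a dict of dicts with an 'extensions' list) is assumed from the example above.

```python
from typing import Any, Dict, cast

# Assumed shape: language name -> dict of properties (e.g. 'extensions')
LanguageData = Dict[str, Dict[str, Any]]

def as_language_dict(obj: object) -> LanguageData:
    """Return obj unchanged, but typed as a language dictionary for static checkers."""
    return cast(LanguageData, obj)

# Intended use (assumes the package is installed):
#   import programming_languages
#   langs = as_language_dict(programming_languages)
#   langs['Python']['extensions']

# Stand-in with the same shape, so the sketch runs without the package
langs = as_language_dict({'Python': {'extensions': ['.py']}})
print(langs['Python']['extensions'])  # => ['.py']
```

cast does nothing at runtime, so this only changes what the type checker sees.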
Get language from an extension:
    def get_lang(file_ext):
        for lang, data in programming_languages.items():
            if file_ext in data['extensions']:
                return lang

    print(get_lang('.al')) # => 'AL'

Get language from a file path:
    from pathlib import Path

    def get_lang_from_path(filepath):
        file_ext = Path(filepath).suffix
        for lang, data in programming_languages.items():
            if file_ext in data['extensions']:
                return lang

    print(get_lang_from_path('main.rs')) # => 'Rust'
    print(get_lang_from_path('script.kt')) # => 'Kotlin'
    print(get_lang_from_path('data.avsc')) # => None (use data-languages pkg)

Copyright © 2026 Adam Lui
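The lookup helpers above scan every language on each call and return only the first match. If you do many lookups, or want every language that claims an extension, a reverse index may be handy. This is a minimal sketch, not something the package provides: build_extension_index and get_langs_from_path are hypothetical helpers, and sample is made-up data with the same shape as the real dictionary.

```python
from collections import defaultdict
from pathlib import Path

def build_extension_index(languages):
    """Map each extension to the list of languages that declare it."""
    index = defaultdict(list)
    for lang, data in languages.items():
        # .get() guards against entries with no 'extensions' key (an assumption)
        for ext in data.get('extensions', []):
            index[ext].append(lang)
    return dict(index)

def get_langs_from_path(filepath, index):
    """Return every language whose extensions include the path's suffix."""
    return index.get(Path(filepath).suffix, [])

# Hypothetical sample with the same shape as the package data
sample = {
    'Rust': {'extensions': ['.rs']},
    'C': {'extensions': ['.c', '.h']},
    'C++': {'extensions': ['.cpp', '.h']},
}
index = build_extension_index(sample)
print(get_langs_from_path('main.rs', index))  # => ['Rust']
print(get_langs_from_path('util.h', index))   # => ['C', 'C++']
print(get_langs_from_path('README', index))   # => []
```

With the package installed, you would pass programming_languages in place of sample. Build the index once and reuse it for every lookup.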
markup-languages - File extensions for markup languages.
prose-languages - File extensions for prose languages.
data-languages - File extensions for data languages.
