URL2MDA
A fast tool to convert any website into LLM and AI Agent ready markdown data (MAGI markdown), with enhanced extraction for sites like Reddit, Twitter, and GitHub.
Usage Examples
Using curl:
$ curl 'https://url2mda.sno.ai/?url=https://example.com'
Using TypeScript (fetch):
import fetch from 'node-fetch'; // Or use browser fetch
const apiUrl = 'https://url2mda.sno.ai';
const targetUrl = 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'; // Example YouTube URL
async function getMarkdown() {
try {
const fetchUrl = apiUrl + '?url=' + encodeURIComponent(targetUrl);
const response = await fetch(fetchUrl);
if (!response.ok) {
throw new Error('HTTP error! status: ' + response.status);
}
const markdown = await response.text();
console.log(markdown);
} catch (error) {
console.error('Error fetching markdown:', error);
}
}
getMarkdown();
Using Python (requests):
import requests
import urllib.parse
api_url = 'https://url2mda.sno.ai'
target_url = 'https://github.com/openai/gpt-3' # Example GitHub URL
params = {'url': target_url}
try:
response = requests.get(api_url, params=params)
response.raise_for_status() # Raise exception for bad status codes
markdown = response.text
print(markdown)
except requests.exceptions.RequestException as e:
print(f"Error fetching markdown: {e}")
Parameters
Required:
url
(string): The website URL to convert.
Optional:
-
subpages
(boolean, default:false
): Attempts to crawl and return markdown for up to 10 linked subpages found on the provided URL. When using text response format, all subpages will be combined into a single document with sections.# Example (curl) $ curl 'https://url2mda.sno.ai/?url=https://example.com/blog&subpages=true'
-
llmFilter
(boolean, default:false
): Processes the extracted markdown through an LLM to filter out boilerplate, ads, and other non-essential content.# Example (curl) $ curl 'https://url2mda.sno.ai/?url=https://example.com&llmFilter=true'
Response Types
- Default: Returns plain text markdown (
Content-Type: text/plain
).- With
subpages=true
: Returns a single combined document with sections for each subpage.
- With
- Use
Content-Type: application/json
header for JSON response.- For single page: Returns a JSON object with the URL and markdown content.
- With
subpages=true
: Returns a JSON array where each object represents a page.