url-extractor

Native text.extract

Extracts and categorizes all URLs from text content, with optional validation of link accessibility.

by @rotifer-protocol

Version

0.1.0

Score

0.39

Downloads

Created

Mar 17, 2026

Updated

Mar 18, 2026

Install

$ rotifer install url-extractor

Score Breakdown

Gene Score 0.39

Component	Weight	Score
Arena	50%	0.77
Usage	30%	0.00
Stability	20%	0.01
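
The breakdown figures are consistent with a weighted sum of the three components. The page does not state the formula explicitly, so the following Python sketch is an assumption that happens to reproduce the displayed Gene Score:

```python
# Weights and component scores copied from the Score Breakdown.
# Assumption: the Gene Score is the plain weighted sum of the components.
components = {
    "arena": (0.50, 0.77),
    "usage": (0.30, 0.00),
    "stability": (0.20, 0.01),
}

gene_score = sum(weight * score for weight, score in components.values())
print(round(gene_score, 2))  # 0.39
```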

Arena History

Date	Fitness	Safety	Calls
Mar 17	0.7730	0.92	1

README

url-extractor

A Native Gene that extracts and categorizes all URLs from text content.

Usage

rotifer test url-extractor --input '{
  "text": "Visit https://rotifer.dev for docs. Contact [email protected] for help.",
  "includeEmails": true
}'
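
When the input is assembled programmatically, quoting the JSON by hand is error-prone. A minimal Python sketch, assuming only the `rotifer test <gene> --input <json>` form shown in the command above; the helper name `build_test_command` is hypothetical and not part of the Rotifer CLI:

```python
import json
import shlex

def build_test_command(text, include_emails=False):
    """Build the argv list for `rotifer test url-extractor` (hypothetical helper)."""
    payload = {"text": text, "includeEmails": include_emails}
    # json.dumps handles escaping of quotes and newlines inside the text.
    return ["rotifer", "test", "url-extractor", "--input", json.dumps(payload)]

argv = build_test_command("Visit https://rotifer.dev for docs.", include_emails=True)
# shlex.join quotes the JSON argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Passing `argv` as a list to `subprocess.run` would avoid shell quoting altogether.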

Features

Extract HTTP/HTTPS/FTP URLs from any text
Optional email address extraction
Automatic deduplication
Domain categorization
Character position tracking for each URL
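
The features listed above can be sketched in a few lines of Python. This is an illustration, not the Gene's actual implementation: the regular expressions, the trailing-punctuation rule, and the choice to count `totalFound` after deduplication are all assumptions.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns; the Gene's real matching rules are not documented.
URL_RE = re.compile(r"\b(?:https?|ftp)://[^\s<>\"']+", re.IGNORECASE)
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
TRAILING_PUNCT = ".,;:!?)]}"

def extract(text, include_emails=False, deduplicate=True):
    """Return URLs with protocol, domain, and character offset."""
    urls, seen = [], set()
    for match in URL_RE.finditer(text):
        # Drop sentence punctuation that the pattern swallowed at the end.
        url = match.group(0).rstrip(TRAILING_PUNCT)
        if deduplicate and url in seen:
            continue
        seen.add(url)
        parts = urlsplit(url)
        urls.append({
            "url": url,
            "protocol": parts.scheme.lower(),
            "domain": parts.hostname or "",
            "position": match.start(),  # zero-based offset of the first occurrence
        })
    emails = []
    if include_emails:
        found = EMAIL_RE.findall(text)
        # dict.fromkeys removes duplicates while keeping first-seen order.
        emails = list(dict.fromkeys(found)) if deduplicate else found
    domains = list(dict.fromkeys(u["domain"] for u in urls if u["domain"]))
    return {
        "urls": urls,
        "emails": emails,
        "totalFound": len(urls),  # assumption: counted after deduplication
        "uniqueDomains": domains,
    }
```

One known limitation of this sketch: a URL that embeds credentials (`https://user@host/`) would also match the email pattern.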

Input

Field	Type	Required	Description
`text`	string	Yes	Text to extract URLs from
`includeEmails`	boolean	No	Also extract emails (default: false)
`deduplicate`	boolean	No	Remove duplicates (default: true)

Output

Field	Type	Description
`urls`	array	Extracted URLs with protocol, domain, position
`emails`	array	Extracted emails (if enabled)
`totalFound`	number	Total URL count
`uniqueDomains`	string[]	List of unique domains found

Phenotype

inputSchema

{
  "type": "object",
  "required": [
    "text"
  ],
  "properties": {
    "text": {
      "type": "string",
      "description": "Text content to extract URLs from"
    },
    "deduplicate": {
      "type": "boolean",
      "default": true,
      "description": "Remove duplicate URLs"
    },
    "includeEmails": {
      "type": "boolean",
      "default": false,
      "description": "Also extract email addresses"
    }
  }
}
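
A caller can mirror this schema before invoking the Gene. The following Python sketch checks the required field and fills in the two defaults by hand; `normalize_input` is a hypothetical helper, and a full JSON Schema validator would be the more general choice.

```python
def normalize_input(payload):
    """Check the required field and apply the defaults from inputSchema."""
    if not isinstance(payload, dict):
        raise TypeError("input must be a JSON object")
    if not isinstance(payload.get("text"), str):
        raise ValueError("'text' is required and must be a string")
    normalized = {
        "text": payload["text"],
        "deduplicate": payload.get("deduplicate", True),       # schema default: true
        "includeEmails": payload.get("includeEmails", False),  # schema default: false
    }
    for key in ("deduplicate", "includeEmails"):
        if not isinstance(normalized[key], bool):
            raise ValueError(f"'{key}' must be a boolean")
    return normalized
```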

outputSchema

{
  "type": "object",
  "properties": {
    "urls": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "url": {
            "type": "string"
          },
          "domain": {
            "type": "string"
          },
          "position": {
            "type": "number",
            "description": "Character offset in source text"
          },
          "protocol": {
            "type": "string",
            "description": "http, https, ftp, etc."
          }
        }
      },
      "description": "Extracted URLs with metadata"
    },
    "emails": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Extracted email addresses (if includeEmails is true)"
    },
    "totalFound": {
      "type": "number",
      "description": "Total URLs found"
    },
    "uniqueDomains": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "List of unique domains"
    }
  }
}
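
A short Python sketch of how a consumer might use a result of this shape. It assumes `position` is a zero-based offset to the start of the URL, which the schema description suggests but does not confirm; both helper names are hypothetical.

```python
from collections import defaultdict

def group_by_domain(result):
    """Map each domain to the list of URLs reported under it."""
    grouped = defaultdict(list)
    for entry in result.get("urls", []):
        grouped[entry["domain"]].append(entry["url"])
    return dict(grouped)

def offset_matches(text, entry):
    """Return True when the reported offset points at the URL in the source text."""
    start = entry["position"]
    return text[start:start + len(entry["url"])] == entry["url"]

# Hand-written sample in the outputSchema shape, not real Gene output.
source = "Visit https://rotifer.dev for docs."
sample = {
    "urls": [{"url": "https://rotifer.dev", "domain": "rotifer.dev",
              "position": 6, "protocol": "https"}],
    "emails": [],
    "totalFound": 1,
    "uniqueDomains": ["rotifer.dev"],
}
```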
