scraper

package
v0.2.0
Warning

This package is not in the latest version of its module.

Published: Jan 17, 2026 License: GPL-3.0 Imports: 9 Imported by: 0

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Option

type Option func(*Scraper)

Option is a functional option that configures a Scraper.

func WithConcurrency

func WithConcurrency(concurrency int) Option

WithConcurrency sets the concurrency level.

func WithDepth

func WithDepth(depth int) Option

WithDepth sets the maximum crawling depth.

func WithInternalOnly

func WithInternalOnly(internalOnly bool) Option

WithInternalOnly sets whether only internal links are checked.

func WithTimeout

func WithTimeout(timeoutSec int) Option

WithTimeout sets the timeout for HTTP requests, in seconds.

func WithVerbose

func WithVerbose(verbose bool) Option

WithVerbose enables or disables verbose output.

type QueueItem

type QueueItem struct {
	URL       string
	SourceURL string
	Depth     int
}

QueueItem represents a URL to be processed, along with the URL it was found on and its crawl depth.
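How the package consumes QueueItem is not documented here. One plausible reading is a breadth-first crawl bounded by depth, sketched below over an in-memory link graph. The crawl function, the links map, and the seen-set are hypothetical and stand in for real HTTP fetching and link extraction.

```go
package main

import "fmt"

// QueueItem copies the documented struct.
type QueueItem struct {
	URL       string
	SourceURL string
	Depth     int
}

// crawl walks a hypothetical link graph breadth-first from start. It visits
// each URL at most once and does not expand items that sit at maxDepth.
func crawl(start string, links map[string][]string, maxDepth int) []QueueItem {
	queue := []QueueItem{{URL: start, Depth: 0}}
	seen := map[string]bool{start: true}
	var visited []QueueItem

	for len(queue) > 0 {
		item := queue[0]
		queue = queue[1:]
		visited = append(visited, item)

		if item.Depth >= maxDepth {
			continue // depth limit reached: record the item but follow no links
		}
		for _, next := range links[item.URL] {
			if seen[next] {
				continue
			}
			seen[next] = true
			queue = append(queue, QueueItem{
				URL:       next,
				SourceURL: item.URL,
				Depth:     item.Depth + 1,
			})
		}
	}
	return visited
}

func main() {
	links := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/c", "/"},
		"/c": {"/d"},
	}
	for _, item := range crawl("/", links, 2) {
		fmt.Printf("depth=%d url=%s source=%q\n", item.Depth, item.URL, item.SourceURL)
	}
}
```

Recording SourceURL on each item is what later allows a broken link to be reported together with the page that referenced it.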

type Result

type Result struct {
	URL        string `json:"url"`
	SourceURL  string `json:"source_url,omitempty"`
	Status     int    `json:"status"`
	Error      string `json:"error,omitempty"`
	Type       string `json:"type"` // link, image, script, stylesheet, css-import
	IsExternal bool   `json:"is_external"`
}

Result represents the outcome of checking a single URL.
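The struct tags determine how a Result serializes. The sketch below copies the documented struct and encodes two sample values with encoding/json; the URLs and the error text are made-up illustrations. Fields tagged omitempty (source_url, error) are dropped when empty, while is_external is always emitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result copies the documented struct, including its JSON tags.
type Result struct {
	URL        string `json:"url"`
	SourceURL  string `json:"source_url,omitempty"`
	Status     int    `json:"status"`
	Error      string `json:"error,omitempty"`
	Type       string `json:"type"` // link, image, script, stylesheet, css-import
	IsExternal bool   `json:"is_external"`
}

// encode returns the compact JSON form of a Result.
func encode(r Result) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	samples := []Result{
		// No SourceURL or Error: both keys are omitted from the output.
		{URL: "https://example.com/", Status: 200, Type: "link"},
		// Illustrative failure; the error text is invented for this sketch.
		{
			URL:       "https://example.com/missing.png",
			SourceURL: "https://example.com/",
			Status:    404,
			Error:     "not found",
			Type:      "image",
		},
	}
	for _, r := range samples {
		s, err := encode(r)
		if err != nil {
			fmt.Println("encode error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```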

type Results

type Results struct {
	BaseURL   string   `json:"base_url"`
	Errors    []Result `json:"errors"`
	Successes []Result `json:"successes"`
	Total     int      `json:"total"`
}

Results collects the Result values from a scan of BaseURL, split into Errors and Successes.
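A common next step with a Results value is to group failures by the page that referenced them. The sketch below copies the documented structs; brokenBySource is a hypothetical helper, not part of the package, and the sample data is invented.

```go
package main

import (
	"fmt"
	"sort"
)

// Result and Results copy the documented structs.
type Result struct {
	URL        string `json:"url"`
	SourceURL  string `json:"source_url,omitempty"`
	Status     int    `json:"status"`
	Error      string `json:"error,omitempty"`
	Type       string `json:"type"`
	IsExternal bool   `json:"is_external"`
}

type Results struct {
	BaseURL   string   `json:"base_url"`
	Errors    []Result `json:"errors"`
	Successes []Result `json:"successes"`
	Total     int      `json:"total"`
}

// brokenBySource (hypothetical helper) maps each source page to the URLs of
// the failed checks found on it.
func brokenBySource(r Results) map[string][]string {
	grouped := make(map[string][]string)
	for _, e := range r.Errors {
		grouped[e.SourceURL] = append(grouped[e.SourceURL], e.URL)
	}
	return grouped
}

func main() {
	r := Results{
		BaseURL: "https://example.com",
		Errors: []Result{
			{URL: "https://example.com/old", SourceURL: "https://example.com/", Status: 404, Type: "link"},
			{URL: "https://example.com/logo.png", SourceURL: "https://example.com/", Status: 404, Type: "image"},
			{URL: "https://example.com/gone", SourceURL: "https://example.com/blog", Status: 410, Type: "link"},
		},
		Total: 3,
	}

	grouped := brokenBySource(r)

	// Sort the source pages so the report order is stable across runs.
	sources := make([]string, 0, len(grouped))
	for src := range grouped {
		sources = append(sources, src)
	}
	sort.Strings(sources)

	for _, src := range sources {
		fmt.Printf("%s: %v\n", src, grouped[src])
	}
}
```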

type Scraper

type Scraper struct {
	// contains filtered or unexported fields
}

Scraper crawls a website and checks its links.

func New

func New(options ...Option) *Scraper

New creates a new Scraper configured with the given options.

func (*Scraper) Scan

func (s *Scraper) Scan(baseURL string) (*Results, error)

Scan crawls the website starting at baseURL and returns the collected results.
