Skip to content

Not all links are found when custom target_content is set. #1

@spaceemotion

Description

@spaceemotion

I wanted to try out this package for an internal RAG presentation based on some of our company website data. Our website has the typical structure: header, main content and footer. Most of the links are in the header and footer part, but if I set target_content to only the main container, no links are getting found.

Is there a way to have all URLs collected outside of the container i ultimately want to parse into markdown? When i don't specify the target i am getting quite a mess of navigation and footer link data that repeats on every page (which would have to be cleaned up for every single page)..

Thanks in advance!


Edit: I think it was actually a combination of the domain, base path flags and the target content that returns zero results. once I removed all the extra flags, the target content worked as advertised.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions