





1:34 am

April 15, 2024

Hey there,
I recognize that the duplicate post detection only works on a feed-by-feed level, but is there any way to apply it across all feeds? For example, if Feed A and Feed B have the same article, but titled slightly differently, I end up with both articles. As far as I know, there's no way to pass parameters into the Content Filtering step. Is that correct?
Open to any suggestions on how I might avoid this. It's causing a lot of unnecessary clutter and duplication, since I have multiple feeds in the same industry.
Thanks!
The uniqueness of the imported article is determined by the uniqueness of the link to the original text. Thus, if you have two feeds containing the same article, the plugin will add it only once, and will not create a copy. As an alternative or in addition to checking for uniqueness by link, you can also check for uniqueness by article title.
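To make the two checks concrete, here is a minimal sketch of exact-match deduplication by link, with an optional exact-match check by title. It is written in Python for illustration only and is not the plugin's code; `SeenIndex`, `is_duplicate`, and the whitespace/case normalization of titles are assumptions made for this example.

```python
class SeenIndex:
    """Remembers links and titles of already imported articles (hypothetical helper)."""

    def __init__(self):
        self.links = set()
        self.titles = set()

    @staticmethod
    def _title_key(title):
        # Assumption: titles are compared case-insensitively with collapsed whitespace.
        return " ".join(title.lower().split())

    def add(self, link, title):
        self.links.add(link.strip())
        self.titles.add(self._title_key(title))

    def is_duplicate(self, link, title, check_title=False):
        # The link check works across feeds because the index is shared by all of them.
        if link.strip() in self.links:
            return True
        # Optional second check: the same title under a different link.
        return check_title and self._title_key(title) in self.titles


index = SeenIndex()
index.add("https://a.example/post-1", "New Jurassic Park Trailer")
print(index.is_duplicate("https://a.example/post-1", "Anything"))
print(index.is_duplicate("https://b.example/x", "Watch Jurassic Park Trailer", check_title=True))
```

Both lookups are set membership tests, so they stay cheap, but they only catch exact matches: the second call reports the reworded title as a new article.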
9:54 am

April 15, 2024

Unfortunately, unless the title is exactly the same, the system doesn't understand the article itself. For example, "New Jurassic Park Trailer" and "Watch Jurassic Park Trailer" would be interpreted as two separate articles, despite being the same thing.
I've semi-solved it using custom PHP code, but it runs after the AI generation (and token usage), so there's a fair amount of waste there. Is there a way to do that kind of check before the AI is engaged to rewrite?
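The ordering being asked for can be sketched as follows: a cheap title-similarity check decides whether the costly rewrite step runs at all. This is an illustration in Python, not the plugin's pipeline. `import_items`, the `rewrite` callback, and the 0.5 threshold are hypothetical, and word-level Jaccard similarity is just one possible measure.

```python
import re


def title_words(title):
    """Lowercased set of alphanumeric words in a title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Share of words two titles have in common (0.0 to 1.0)."""
    wa, wb = title_words(a), title_words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def import_items(titles, existing_titles, rewrite, threshold=0.5):
    """Call the costly `rewrite` only for titles that are not near-duplicates."""
    known = list(existing_titles)
    imported = []
    for title in titles:
        if any(jaccard(title, seen) >= threshold for seen in known):
            continue  # skipped before any tokens are spent
        imported.append(rewrite(title))
        known.append(title)
    return imported


# Stand-in for the AI rewrite; a real one would call an external service.
result = import_items(
    ["New Jurassic Park Trailer", "Watch Jurassic Park Trailer", "Stock Markets Rally"],
    [],
    rewrite=str.upper,
)
print(result)
```

With these inputs the two trailer titles share three of five distinct words (similarity 0.6), so the second one is dropped before `rewrite` is called.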
If you have a vision of how this can be done in practice, please describe the algorithm for checking article text for uniqueness. Just keep in mind that the script can't compare the text of each imported article against the text of every post in your WP database. Such a check is resource-intensive and can easily overload your server: the database may contain many thousands of fairly large articles, and each of them would have to be compared with every item from every imported feed.
Most users run the plugin on shared hosting, and even checking for uniqueness by title or by link separately can put a heavy load on virtual hardware if there are many items to check. This is why it is recommended to use one of these simplified checks, rather than doing both the title and link checks in one pass, whenever possible.
6:20 pm

April 15, 2024

Oh, I understand. I definitely don't expect anything to be able to absorb an entire article to compare, but it would be helpful if the duplicate check had a sliding scale. Right now it requires a 100% match; if it accepted a 70-80% match, I'd be in better shape.
For example, here's the code I'm running now. It basically filters anything that uses 3 of the same words. It's not ideal, and needs some tweaking, but it's been helpful so far. However, as I said, it's running AFTER generating the AI rewrite, and it'd be more helpful if I could run it on the initial check.
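The snippet itself is not preserved in this thread. As a rough illustration of the idea only (flag a title that shares at least three words with an existing one), here is a sketch in Python rather than the poster's PHP. The function names, the stopword list, and the `min_shared` parameter are all assumptions made for this example.

```python
import re

# Assumption: very common words are ignored so they don't trigger false matches.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "for", "on", "with"})


def significant_words(title):
    """Lowercased words of a title, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOPWORDS}


def shares_enough_words(title, other, min_shared=3):
    """True when two titles have at least `min_shared` significant words in common."""
    return len(significant_words(title) & significant_words(other)) >= min_shared


def find_duplicate(title, existing_titles, min_shared=3):
    """Return the first existing title judged a duplicate, or None."""
    for other in existing_titles:
        if shares_enough_words(title, other, min_shared):
            return other
    return None


existing = ["New Jurassic Park Trailer"]
print(find_duplicate("Watch Jurassic Park Trailer", existing))
print(find_duplicate("Watch the New Trailer", existing))
```

A fixed word count is blunt: long titles on unrelated stories can share three words by chance. Dividing the shared count by the total number of distinct words would give the kind of percentage scale mentioned above.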
A partial-match check can't be done with a standard MySQL query, so that kind of text comparison would take even more time than a full-text comparison.
As for the snippet: if you want to tweak it a bit, I suggest using our GPT assistant, which is familiar with the documentation.