May 4, 2011

Syndication feed list not fully executed. | CyberSEO Pro | Support Forum

Syndication feed list not fully executed.
July 19, 2017
3:21 pm
s.popadiin@gmail.com
Member
Forum Posts: 32
Member Since: May 11, 2017

I run CyberSEO from a server cron job every minute.

I pull content from 20 feeds, and each feed pulls only 1 post per run. The feed intervals are staggered by one minute: feed one runs every 1 minute, feed two every 2 minutes ... feed twenty every 20 minutes. This was the only way I found to make the feeds run in a distributed way! I think the scheduling needs a lot of improvement in this regard.

The problem is that everything runs OK up to feed 15; the last few just stay at next run "ASAP" and last run "12344567 minutes".

Once I run them manually, they pull the post and start waiting for their schedule, but they never run again.

 

Thanks for any help.

July 20, 2017
12:22 am
CyberSEO
Admin
Forum Posts: 3950
Member Since: July 2, 2009

First of all, 1 minute is a very short period. I'd suggest making it at least 10 minutes or so. As for the issue with pulling the last feeds, it seems the script exceeds the maximum execution time. I suggest you increase it (max_execution_time in php.ini).

July 20, 2017
5:47 am
s.popadiin@gmail.com

If I put 10 minutes between feed pulls, then the last feed, number 20, will pull every 200 minutes, which is too rare. To compensate, I would have to make each feed pull at least 3 articles, which again puts more load on the server. Without better scheduling the task is unsolvable, because the first feeds will always overlap in time with the last ones. You cannot make each feed pull in its own unique time slot. If you have more than 10 feeds, the first feed will always overlap with the last.
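The 200-minute figure follows from simple arithmetic. Here is a minimal sketch, assuming feed number k gets an interval of k times the step; the function names are hypothetical and not part of CyberSEO.

```python
def staggered_intervals(feed_count: int, step_minutes: int) -> list[int]:
    """Feed k (1-based) gets an interval of k * step_minutes."""
    return [k * step_minutes for k in range(1, feed_count + 1)]


def pulls_per_day(interval_minutes: int) -> float:
    """How many times a feed with this interval is pulled in 24 hours."""
    return 24 * 60 / interval_minutes


intervals = staggered_intervals(feed_count=20, step_minutes=10)
print(intervals[0], intervals[-1])    # 10 200
print(pulls_per_day(intervals[0]))    # 144.0
print(pulls_per_day(intervals[-1]))   # 7.2
```

With a 10-minute step, the first feed is pulled about 20 times as often as the last one, which is the imbalance described above.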

I would suggest that CyberSEO have two scheduling modes: first, "every X minutes" as it is now; second, "at a specified time", cron-like, e.g. in the first 5 minutes of every hour.
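A rough sketch of what such a fixed-slot mode could look like, assuming each feed is assigned its own minute offset within a repeating period. This is purely illustrative: `slot_minutes` and `feeds_due` are hypothetical helpers, not an existing CyberSEO feature.

```python
def slot_minutes(feed_count: int, period_minutes: int = 60) -> list[int]:
    """Spread feeds evenly over one period; returns each feed's minute offset."""
    return [(i * period_minutes) // feed_count for i in range(feed_count)]


def feeds_due(minute_of_day: int, slots: list[int], period_minutes: int = 60) -> list[int]:
    """Indices of the feeds whose fixed slot matches the current minute."""
    return [i for i, slot in enumerate(slots) if minute_of_day % period_minutes == slot]


slots = slot_minutes(feed_count=20)
print(slots[:4])             # [0, 3, 6, 9]
print(feeds_due(63, slots))  # [1]
```

Because every feed owns a distinct minute in the hour, two feeds are never due on the same tick as long as there are no more feeds than minutes in the period.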

Can you implement this as a bug fix? I think it will be a very nice improvement for your next release.

 

My max_execution_time in php.ini is already set to 300 sec.

July 25, 2017
10:13 am
CyberSEO

This is not a bug and I'm not planning to implement this feature in the near future, because it will break compatibility with old versions of CyberSEO and CyberSyn plugins.

July 25, 2017
12:56 pm
s.popadiin@gmail.com

OK, I understand your constraint.

However, what do you suggest as the best way to pull 20 or more feeds, in terms of scheduling and posts per pull?

You know better how it works.

July 28, 2017
5:00 am
s.popadiin@gmail.com

I reset cron to every 10 minutes, because obviously a run cannot finish within 1 minute and the cron runs would pile on top of each other, although it worked OK for some time until the feeds got stuck in the ASAP state.

I have 30 feeds. How should I set my scheduling so that all of them execute?

July 29, 2017
7:27 am
CyberSEO

s.popadiin@gmail.com said
However, what do you suggest as the best way to pull 20 or more feeds, in terms of scheduling and posts per pull?

That depends on your server capabilities. If it's a rather weak one, you can schedule the plugin to pull the feeds, say, 20 times a day at different times.

July 29, 2017
7:53 am
s.popadiin@gmail.com

You mean run the cron 20 times?

I am asking about scheduling the pulls for 30 feeds.

Anyway, can you explain a bit more about the relationship between cron and scheduling?

Let's say my cron runs every hour.

And I have two feeds: the first is pulled every 20 minutes and the second every 30 minutes.

Once the cron runs, it will pull both feeds. The clock will start counting, but the feeds will be pulled only the next time cron is executed, regardless of their schedule, and because their time has expired, both feeds will be pulled at the same time.

Is this logic correct? Or do the feeds somehow count down to their exact pull time after the cron run? :)

July 31, 2017
4:35 am
CyberSEO

Yes, that's correct. This is why your cron period must be shorter than the shortest feed update period. So if you have set the feeds to be pulled once every 20 and 30 minutes, the cron must run every 10 minutes.
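The behaviour confirmed here can be sketched with a small simulation. It assumes a feed is pulled on a cron tick whenever at least its interval has passed since its last pull, and that all feeds were last pulled at minute 0. The `simulate` function is a hypothetical model, not CyberSEO's actual code.

```python
def simulate(cron_period: int, feed_intervals: list[int], horizon: int) -> dict[int, list[int]]:
    """Return, per feed interval, the minutes at which that feed is actually pulled."""
    last_run = {interval: 0 for interval in feed_intervals}
    pulls = {interval: [] for interval in feed_intervals}
    for now in range(cron_period, horizon + 1, cron_period):  # feeds are checked on cron ticks only
        for interval in feed_intervals:
            if now - last_run[interval] >= interval:  # the feed is due or overdue
                pulls[interval].append(now)
                last_run[interval] = now
    return pulls


print(simulate(60, [20, 30], 180))  # {20: [60, 120, 180], 30: [60, 120, 180]}
print(simulate(10, [20, 30], 60))   # {20: [20, 40, 60], 30: [30, 60]}
```

With an hourly cron both feeds collapse onto the same hourly tick; with a 10-minute cron each feed keeps its own 20- or 30-minute rhythm and the two coincide only at minute 60.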

July 31, 2017
5:59 am
s.popadiin@gmail.com

Sorry to say it, but you have developed VERY BAD LOGIC for scheduling.

Precisely because cron times and pull times follow different logic, the result is a mess.

With many feeds, cron cannot have a long period, because feeds will be pulled too rarely and their pulls will always overlap, which puts A LOT of pressure on PHP as it waits for all those simultaneous feed pulls. That is where most things break, such as spins and translations. It is too much at the same time.

Nor can cron run with a short period, because it then permanently eats the server whether it is pulling a new feed or just checking for one, and when there is a new feed, one cron run may not have finished when the next one starts.

Setting up proper timing is a huge pain and maybe not 100% possible, because it depends on so many factors.

What messes everything up most is this: however you arrange the schedule to get an order for each feed and a gap between feeds, the pull times soon drift whenever a pull is unsuccessful, you pull manually, cron is delayed, or the PHP time limit expires before a pull finishes. Then whatever order you had is broken and the mess is complete.

I also work with another plugin, WPeMatico, where the pull schedule maps 1:1 to cron logic, and it proves much better.

I advise you to develop a fix that offers both scheduling modes, so that none of the legacy versions become useless.

The only solution for this bad logic is a very powerful server. I would also advise creating a hidden WP installation used only for pulling full-text feeds, and only then copying or re-pulling those feeds to the publishing site. That way, extra resources can be given to the pulling site without making the server vulnerable in case of HTTP overload.

The best setup I managed to achieve is a cron job that runs every minute. This way the order of my feeds (set by giving the first feed 5 minutes, the second 10 minutes, etc.) is always respected no matter what happens. I have set the first feed to a 10-minute period, the second to 13, and every next one to 3 minutes more. This way you always have a guaranteed 10 minutes after the last pull has finished. And the 3-minute step gives you an overlap for every 10th feed while still leaving at least 3 minutes for the pull itself.
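How often feeds collide under a 10, 13, 16, ... minute setup can be estimated with a short script. This is a simplified sketch under stated assumptions: all feeds start at minute 0, every pull takes no time, cron ticks every minute, and there are 30 feeds. The function name is hypothetical.

```python
from collections import Counter


def pulls_per_minute(intervals: list[int], horizon: int) -> Counter:
    """Count how many feeds are due at each minute, assuming ideal periodic pulls."""
    load = Counter()
    for interval in intervals:
        for minute in range(interval, horizon + 1, interval):
            load[minute] += 1
    return load


intervals = [10 + 3 * k for k in range(30)]  # 10, 13, 16, ... (30 feeds is an assumption)
load = pulls_per_minute(intervals, horizon=24 * 60)
busiest_minute, feeds_at_once = max(load.items(), key=lambda item: item[1])
print(f"busiest minute: {busiest_minute}, feeds due together: {feeds_at_once}")
print("minutes with more than one feed due:", sum(1 for n in load.values() if n > 1))
```

Real pulls take time and can fail or be delayed, so actual overlap is likely to be worse than this idealised count suggests.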

In conclusion, if the CyberSEO development team fixes this logic, it will resolve half of the support posts on this forum, along with PHP errors due to overload, unexplained problems with third-party APIs, etc.

July 31, 2017
6:06 am
s.popadiin@gmail.com

admin said

s.popadiin@gmail.com said
However, what do you suggest as the best way to pull 20 or more feeds, in terms of scheduling and posts per pull?

That depends on your server capabilities. If it's a rather weak one, you can schedule the plugin to pull the feeds, say, 20 times a day at different times.

The setup I suggested, with a 1-minute cron, is contrary to your suggestion. It runs on a 1.5 GB, 1-core VPS that did only that, with no traffic.

I tried your suggestion of a 10-minute cron on a 4 GB, 2-core VPS, and it could not bear it.

The overlapping feed pulls put so much stress on PHP that only the first feeds are pulled.

July 31, 2017
6:36 am
CyberSEO

s.popadiin@gmail.com said
Sorry to say it, but you have developed VERY BAD LOGIC for scheduling.

Don't like the existing scheduler implementation? OK, suggest your own! Feel free to explain it here. Maybe it will be better than the current one and I'll implement it.

July 31, 2017
8:08 am
s.popadiin@gmail.com

admin said

s.popadiin@gmail.com said
Sorry to say it, but you have developed VERY BAD LOGIC for scheduling.

Don't like the existing scheduler implementation? OK, suggest your own! Feel free to explain it here. Maybe it will be better than the current one and I'll implement it.

Hey Dev/Admin, don't be offended. It's not about liking or not liking. First, it's about the most durable, simple logic that serves its purpose. Second, it's about scalability and less support.

I will make my suggestion in another thread.

July 31, 2017
9:59 am
CyberSEO

Unfortunately the most durable-simple logic is usually limited by the existing technologies.

August 27, 2017
7:28 am
s.popadiin@gmail.com

I am writing to report that I migrated my install to a very powerful server. Now everything works as it should. I did not increase execution times or memory; every setting is around its default value. BUT (a big but), my CPU usage is so high that it looks like I am mining cryptocurrency.

I checked, and all the CPU goes to fivefilters.php, aka the script that pulls full text. Then I checked whether it is really pulling, and it turns out it is just checking for new articles to pull.

So again we are trapped by the wrong scheduling logic, more specifically by the overlapping feed pull times. I could increase my intervals, but this will not solve the overlapping: my 1st feed will still run 10 times while my last feed runs once over a given span, no matter what intervals you set, unless you set 0, but then all feeds would pull at the same time, which is impossible.

 

PS. Hope to see a plugin update with a fix soon.

August 28, 2017
1:37 am
CyberSEO

The fivefilters script is not part of CyberSEO and is distributed under the GPL license as stand-alone code. I have no control over it and I can't guarantee anything about it. You decide whether you want to use it or not. Technically, you can use any other full-text extractor instead of fivefilters; for CyberSEO it's just a service that can run on any remote third-party host.

August 28, 2017
3:49 pm
s.popadiin@gmail.com

I don't agree.

It's CyberSEO that controls when the fivefilters script is run. And it runs it in a chaotic, badly controlled way.

Once again, I repeat: if you have two feeds to pull, the first once per hour and the second once per two hours, the first feed will run twice in those two hours. You cannot make both feeds run just once in those two hours!!!
IT IS IMPOSSIBLE! Prove me wrong, please.

And if you add ten feeds, guess what happens... overlapping, overlapping, overlapping, overlapping, overlapping.
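The two-feed case can be written out directly. A small sketch, assuming both feeds start at minute 0 and are pulled exactly on their interval; `pull_minutes` is a hypothetical helper.

```python
def pull_minutes(interval: int, window: int) -> list[int]:
    """Minutes at which a feed with this interval is pulled inside the window."""
    return list(range(interval, window + 1, interval))


window = 120  # two hours
hourly = pull_minutes(60, window)
two_hourly = pull_minutes(120, window)
shared = sorted(set(hourly) & set(two_hourly))

print(hourly)      # [60, 120]
print(two_hourly)  # [120]
print(shared)      # [120]
```

In the two-hour window the hourly feed runs twice, and its second run lands on the same minute as the two-hourly feed.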

August 29, 2017
12:16 am
CyberSEO

CyberSEO does not control the fivefilters script in any way. It's a stand-alone service that can run on any remote host. For example, you can use the original URL (link hidden from guests) and... it will work.

As for your second question: set the 1st feed to be pulled once an hour and the other one once every hour and a half. This is an obvious solution, IMHO.

August 29, 2017
3:49 am
s.popadiin@gmail.com

:) Yes, this time you are right, but only almost. If you take a period of 3 hours, the problem persists. Can you give an obvious solution for 30 feeds?

It is also obvious that some feeds will always run less often than others.

I think it's time for you to admit that the scheduling is not done in the best way.

August 29, 2017
4:59 am
CyberSEO
1st feed: once every 60 minutes
2nd feed: once every 62 minutes
3rd feed: once every 64 minutes
and so on...

Want to pull them not every hour? Do it like this:
1st feed: once every 60 minutes
2nd feed: once every 242 minutes
3rd feed: once every 384 minutes

I don't see any problem there.
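How rarely feeds with slightly different intervals coincide can be checked with the least common multiple. This sketch assumes all feeds start at minute 0 and that cron runs often enough (for example every minute) for each feed to be pulled exactly on its interval; with a coarser cron, pulls are rounded up to the next tick and coincide more often.

```python
from itertools import combinations
from math import lcm  # available since Python 3.9


def first_coincidences(intervals: list[int]) -> dict[tuple[int, int], int]:
    """For each pair of intervals, the first minute at which both feeds are due together."""
    return {(a, b): lcm(a, b) for a, b in combinations(intervals, 2)}


print(first_coincidences([60, 62, 64]))
# {(60, 62): 1860, (60, 64): 960, (62, 64): 1984}
print(lcm(60, 62, 64))  # 29760
```

Under these assumptions the 60- and 62-minute feeds first meet after 1860 minutes (31 hours), and all three meet only after 29760 minutes.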

Forum Timezone: Europe/Amsterdam
