Mike

shared this idea
3 years ago

Employees Involved

SCM

Admin

Statistics

4 Comments
1 View

56
votes

Custom article scraper (scrape HTML mode)

(Sorry for my bad English.) For example, I want to scrape all URLs from http://eva.vn/, such as: http://eva.vn/bep-eva/ngay-he-ghe-hang-che-sen-pho-co-noi-tieng-dat-ha-thanh-c162a269199.html

But each of these pages has many parts that I don't want to use on my blog, such as the sapo, the social buttons, and the related posts. I just want to scrape the article and its title, keeping HTML tags such as <h2> and <b>.

How can I do it?

Under Consideration
Comments (4)

45

Please help!

God bless you! :D

Employee
42

Please use XPath.

However, an XPath can only scrape one type of node at a time.

So you can tell it to scrape h2 tags, or you can tell it to scrape only p tags.

For example, to scrape every h2 in the article:

//body//h2

Unfortunately, the scraper only lets you scrape one node per pass (not two).
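
To make the one-node-per-pass behaviour concrete, here is a minimal sketch outside SCM, using only Python's standard library. The sample page and the helper name `select_text` are made up for illustration, and `.//body//h2` is the relative form of `//body//h2` that ElementTree's limited XPath support accepts. Real pages are rarely well-formed XML, so an HTML-tolerant parser would probably be needed in practice.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (an assumption, not taken from eva.vn).
SAMPLE = """<html><body>
<div class="article">
  <h2>First heading</h2>
  <p>Intro paragraph.</p>
  <h2>Second heading</h2>
  <p>Closing paragraph.</p>
</div>
</body></html>"""


def select_text(markup, path):
    """Return the text of every element matched by ONE path expression."""
    root = ET.fromstring(markup)
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


# One pass per node type: h2 first, then p, mirroring the limit described above.
headings = select_text(SAMPLE, ".//body//h2")
paragraphs = select_text(SAMPLE, ".//body//p")
print(headings)
print(paragraphs)
```

Each call selects a single node type, so getting both headings and paragraphs takes two separate passes.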

40

And how can I scrape both the article and its h2 tags, without the sapo and without the related posts? :( WP Pipes can do it very simply, but I love SEO Content Machine and the support team in this forum. I can't stop using SCM :(

And finally, I think this is a helpful idea that I can share for your next update :(

Please help, thank you very much!

God bless you!

(Sorry for my bad English :( )

Employee
35

What you want is currently not possible.

You need one pass to scrape the h2 tags, then another pass to scrape the related-post tag.

You can't do it all in one scrape, because you can only select one XPath node at a time.
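
As a workaround outside SCM, a small script can walk the page once, keep the wanted tags, and skip the unwanted blocks. The following is only a rough sketch with Python's standard library, not an SCM feature. The sample markup, the class names `sapo` and `related`, and the helper `collect` are all assumptions for illustration; the real page would have to be inspected to find the right names.

```python
import xml.etree.ElementTree as ET

# Hypothetical page structure; the class names "sapo" and "related" are assumed.
SAMPLE = """<html><body><div class="article">
<h1>Title</h1>
<p class="sapo">Teaser summary.</p>
<h2>Section</h2>
<p>Body text with <b>bold</b> words.</p>
<div class="related"><h2>Related posts</h2><p>Another story</p></div>
</div></body></html>"""

KEEP_TAGS = {"h1", "h2", "p"}       # blocks to keep, inner tags such as <b> included
SKIP_CLASSES = {"sapo", "related"}  # blocks to drop together with their children


def collect(elem, out):
    """Walk the tree in document order, skipping excluded blocks entirely."""
    if SKIP_CLASSES & set(elem.get("class", "").split()):
        return
    if elem.tag in KEEP_TAGS:
        # Serialize the whole element so inline markup like <b> survives.
        out.append(ET.tostring(elem, encoding="unicode").strip())
        return
    for child in elem:
        collect(child, out)


blocks = []
collect(ET.fromstring(SAMPLE), blocks)
print("\n".join(blocks))
```

The walk keeps the title, the h2, and the body paragraph in their original order, while the sapo paragraph and everything inside the related-posts block are left out.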
