#1 |
Member
Join Date: Jun 2010
Posts: 376
I have a WordPress site that needs to be totally revamped. The content is OK; it's just everything else that needs to be redone.
I'd like to pull the text out. I don't need the formatting, and I definitely don't want the HTML tags. Is there any way to do this?
#2 |
Sysop
Join Date: Oct 2004
Posts: 10,478
I've never gotten around to doing anything with WP but I think there are a couple of members who have used it (still use it?). Until they arrive with their words of wisdom, I found a few links that might prove useful--in no particular order:
1. WP's own support info on exporting...
2. 3rd party info that looks pretty comprehensive...
3. Another 3rd party site that looks useful...

Keep us posted on how it's going.
Terrie
#3 |
Member
Join Date: Jun 2010
Posts: 376
Thanks!
I'll take a look at these.
#4 |
Member
Join Date: Jun 2010
Posts: 376
So, this is not really what I want. This will give me the XML and let me migrate the whole site over. But I don't want to do that. I really only want to pull the actual text. (I already have backups of the site.)
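For what it's worth, the export file may still be a usable source for the text alone, since each post body normally sits in a `content:encoded` element inside an `<item>`. The sketch below is only an illustration under that assumption: `posts_from_export` and `SAMPLE` are made-up names, and the sample XML is a hand-written stand-in, not a real export. The bodies it returns are still HTML, so the tags would have to be stripped as a separate step.

```python
import xml.etree.ElementTree as ET

# Namespaced tag that WordPress export (WXR) files use for the full post body.
CONTENT_TAG = "{http://purl.org/rss/1.0/modules/content/}encoded"


def posts_from_export(xml_text):
    """Return (title, body_html) pairs for every <item> in an export string."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        body = item.findtext(CONTENT_TAG, default="")
        posts.append((title, body))
    return posts


# Tiny hand-made sample standing in for a real export file (assumption).
SAMPLE = """<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Hello</title>
      <content:encoded><![CDATA[<p>First <b>post</b>.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for title, body in posts_from_export(SAMPLE):
        print(title, "->", body)
```

With a real export you would read the file's contents into a string first and pass that in place of `SAMPLE`.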
#5 | |
Member
Join Date: May 2006
Location: Stringston, Somerset,UK
Posts: 236
It sounds to me as though you need regular expressions to get this done. Try searching Google for "regex remove HTML tags", which should throw up some options. You will need some basic scripting ability in something like JavaScript or Python. Let me know if you need further help and I will try to assist, but I am no expert.
Barrie Greed
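As a rough idea of what the regular-expression route looks like in Python, here is a minimal sketch. The function name `strip_tags` is invented for the example, and the pattern is deliberately naive: it treats anything between `<` and `>` as a tag, so it can go wrong on comments, script blocks, or attribute values that contain `>`. It is probably good enough for simple post bodies, not for arbitrary pages.

```python
import html
import re

# Naive pattern: anything from "<" up to the next ">" is treated as a tag.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(markup):
    """Remove tag-like spans, decode entities, and collapse whitespace."""
    # Replace each tag with a space so text on either side of block tags
    # does not run together (this can also split a word around inline tags).
    text = TAG_RE.sub(" ", markup)
    # Turn entities such as &amp; back into plain characters.
    text = html.unescape(text)
    # Collapse the runs of whitespace left behind by the removed tags.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b> &amp; friends</p>"))
```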
#6 |
Member
Join Date: Mar 2005
Location: Derby,UK
Posts: 1,509
Look into web scraping. Beautiful Soup is a Python-based tool for this.
Since it is your own website, it is OK to scrape it; stealing other people's content isn't. Just sayin'
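To give a feel for the parser-based approach, here is a small sketch that uses Python's built-in `html.parser` instead of Beautiful Soup, purely so it runs without installing anything. With Beautiful Soup installed, the rough equivalent would be `BeautifulSoup(markup, "html.parser").get_text()`. The names `TextExtractor` and `page_text` are invented for the example, and it works on a string of HTML; fetching the pages from the site is left out.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def page_text(markup):
    """Return the visible text of an HTML string, pieces joined by spaces."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style></head>"
        "<body><h1>Title</h1><p>Some &amp; text.</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(page_text(page))
```

Because the pieces are joined with single spaces, text that follows an inline tag can pick up a stray space before punctuation, so the output may need a light tidy-up afterwards.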
#7 |
Member
Join Date: Jun 2010
Posts: 376
I hadn't thought about web scraping. I mean most of the time, why would you need to scrape your own content, right? But it just might be the right approach here.
#8 |
Staff
Join Date: Nov 2004
Posts: 7,710
Looks like there are lots of web scraper extensions for Chrome, and all sorts of other things turn up when you google "web scrape" or the like. Some have clever names (Octoparse!) but I'm amazed that a Google search doesn't turn up one named "ScrapeGoat". Seems so OBVIOUS!
__________________
Steve Rindsberg
====================
www.pptfaq.com
www.pptools.com
and stuff
#9 |
Sysop
Join Date: Oct 2004
Posts: 10,478
LOL!!!!
Terrie
#10 | |
Sysop
Join Date: Oct 2004
Posts: 10,478
My guess is that you have so many pages that manually doing a select-all and copy on each one would be too tedious?
Terrie