Line & paragraph breaks in online text


Eric Ladner
12-15-2008, 03:52 PM
I'm trying to copy a large amount of online text, in disconnected chunks, into either Word or a text editor, and thence to InDesign for page layout. (This is a legitimate project, with the author's permission, but copying the online text is the only way I can obtain the material.)

Word (2004, Mac, but I also have 2007 for Windows, if that helps) interprets both the end of every line and the end of every paragraph as a single paragraph break. TextWrangler sees them both as carriage returns. The problem is that, since they're all the same, I can't figure out how to remove the line breaks, which I don't want for page layout, without also removing the actual paragraph breaks.

When I look at the html, I see that lines are ended with <br> and paragraphs with <br><p>, so it seems there should be a way to get Word or the editor to recognize this distinction instead of stripping it out, as they seem to be doing.

I suppose I could work directly on the html, but then I'd have a lot of other garbage to clean up, too.

I'm really frustrated not to be able to figure this out for myself, but it seems like something that must come up frequently. Does anyone have a solution that would let text within paragraphs flow properly, but maintain the breaks between paragraphs?

Thanks, as always, for any suggestions.

--Eric

terrie
12-15-2008, 03:58 PM
I have experienced this too. What I do is copy paragraph-sized chunks, and when I paste, I skip at least 2 lines in whatever I'm pasting into (generally WordPerfect or sometimes even Wincim), which at least makes it easier to see the paragraphs.

I have always just manually gone in and deleted the line breaks where appropriate...

Not sure that helps you much but you have all my sympathy...'-}}

Terrie

Eric Ladner
12-15-2008, 04:22 PM
Thanks, Terrie, for the suggestions and for the sympathy.

It's just frustrating, since I can see that there is a difference in the html that would let me automate it all, if I could just get one of the other programs to recognize it. And it's a big enough project that it would save a lot of time.

But your method of putting in extra blank lines is really helpful; once I do that, then I can do a search and replace on multiple blank lines.

Thanks again!

--Eric
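
The blank-line approach above can also be scripted once the pasted text is saved as plain text. What follows is only a sketch, assuming each paragraph-sized chunk was pasted with at least one blank line after it; the function name `unwrap_paragraphs` is made up for illustration and is not part of Word, TextWrangler, or InDesign.

```python
import re

def unwrap_paragraphs(text):
    """Return a list of paragraphs with the line wraps inside each removed."""
    # Normalise Windows (CR LF) and old-Mac (CR) line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a real paragraph boundary.
    chunks = re.split(r"\n(?:[ \t]*\n)+", text)
    # Inside a chunk every remaining newline is only a line wrap,
    # so join the words back together with single spaces.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]

if __name__ == "__main__":
    pasted = "line one\nline two\n\n\nsecond para\nmore\n"
    for paragraph in unwrap_paragraphs(pasted):
        print(paragraph)
```

Each returned string is one paragraph on a single line, which can then be written out with one return per paragraph for the layout program.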

Eric Ladner
12-15-2008, 04:51 PM
Well, it turns out that Word 2007 for Windows does use hard line breaks at the ends of the lines, and proper paragraphs, so all I need to do is search and remove the line breaks. (And then shuttle it all back over to the Mac, where InDesign lives, but that shouldn't be a problem. I hope. I'll try it on a small sample before I go too far with it.)

I still suspect that the older version of Word on the Mac could do it, too, if I could just find the right switches to flip.

--Eric

ktinkel
12-15-2008, 06:09 PM
Why not make a preliminary pass in Word and convert all the <br><p> sets to ### (that is what I used to use)? Then convert all the remaining <br> characters to a space.

Then go back and convert all the ### to paragraphs. Should do the trick.
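
The same placeholder trick can be run directly on the saved HTML instead of inside Word. This is a rough sketch, not a tested recipe for this particular site: it assumes, as described in the first post, that lines end with <br> and paragraphs with <br><p>, that the <p> tags carry no attributes, and that ### never occurs in the text itself. The function name `unwrap_html_breaks` is hypothetical, and any other markup in the page would still need separate cleanup.

```python
import re

PLACEHOLDER = "###"  # temporary marker from the post; must not occur in the text

def unwrap_html_breaks(html):
    """Join <br>-wrapped lines, keeping <br><p> as the paragraph break."""
    # Step 1: mark the real paragraph breaks (<br> followed by <p>).
    text = re.sub(r"<br\s*/?>\s*<p>", PLACEHOLDER, html, flags=re.IGNORECASE)
    # Step 2: every remaining <br> is only a line end; make it a space.
    text = re.sub(r"<br\s*/?>", " ", text, flags=re.IGNORECASE)
    # Step 3: turn the placeholders back into paragraph breaks, squeezing
    # leftover whitespace (including raw newlines) inside each paragraph.
    paragraphs = [" ".join(p.split()) for p in text.split(PLACEHOLDER)]
    return "\n".join(p for p in paragraphs if p)

if __name__ == "__main__":
    sample = ("First line<br>\nsecond line<br><p>\n"
              "New paragraph<br>\ncontinues here<br><p>\n")
    print(unwrap_html_breaks(sample))
```

The order matters for the same reason it does in Word: the paragraph breaks have to be hidden behind the placeholder before the plain line breaks are replaced, or the two become indistinguishable.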

Michael Rowley
12-16-2008, 07:54 AM
KT: Then go back and convert all the ### to paragraphs. Should do the trick.

Yes, Word makes every break in a line a paragraph mark, so there are at least two paragraph marks for every paragraph break. But since even skilled typists using Word often insert an extra paragraph mark at every paragraph break, it is advisable to search for two consecutive paragraph marks repeatedly until no more are found.

ktinkel
12-16-2008, 09:20 AM
Michael: Yes, Word makes every break in a line a paragraph mark, so there are at least two paragraph marks for every paragraph break. But since even skilled typists using Word often insert an extra paragraph mark at every paragraph break, it is advisable to search for two consecutive paragraph marks repeatedly until no more are found.

So search for 5 of them; then 4; then 3. Pretty sure that by 2 you will have them all.

terrie
12-16-2008, 10:50 AM
eric: Thanks, Terrie, for the suggestions and for the sympathy.

You're welcome...'-}}

What I like about WordPerfect is that there is a "reveal codes" option which will show you all the internal codes and it's very easy to copy the code for a search/replace...

Terrie