Topic: Must run macro several times to get all changes (1 of 12), Read 24 times
Conf: VEDIT Macro Language Support
From: Bert Hyman
Date: Wednesday, February 18, 2009 01:48 PM

Vedit 6.15.2 under Win XP Pro SP3.

I have a macro that does multiple RE replacements in a large file (on the order of 16-20MB). I invoke the macro from a batch file as "vpw -q -x strip.vdm filename.txt". The structure is pretty much a straight copy from the manual:

Margin_Right(0)
while(!AT_EOF)
{
File_Read(0)
Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
Replace("foo", "bla", BEGIN+NOERR+ALL+LOCAL+REGEXP)
...
Replace("some", "thing", BEGIN+NOERR+ALL+LOCAL+REGEXP)
Replace("or","else", BEGIN+NOERR+ALL+LOCAL+REGEXP)
FILE_WRITE(ALL)
}
XALL

On first invocation, it runs to completion, but then I find that not all the replacements have been made. If I run it again, the leftovers are usually done, but sometimes it takes yet another run.

What's up?

 


Topic: Re: Must run macro several times to get all changes (2 of 12), Read 16 times
Conf: VEDIT Macro Language Support
From: Fritz Heberlein
Date: Wednesday, February 18, 2009 02:06 PM

You might want to try huge-sr.vdm (see the doco at BOF).

Fritz

 


Topic: Re: Must run macro several times to get all changes (3 of 12), Read 20 times
Conf: VEDIT Macro Language Support
From: Christian Ziemski
Date: Wednesday, February 18, 2009 02:55 PM

On 18.02.2009 19:49 vedit-macro-language Listmanager wrote:
> From: "Bert Hyman"
>
> I have a macro that does multiple RE replacements in a large file (on the order of 16-20MB).
>
> Margin_Right(0)
> while(!AT_EOF)
> {
> File_Read(0)
> Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
> ...
> FILE_WRITE(ALL)
> }


From the help:

"Note: With VEDIT's auto buffering File_Read() and File_Write()
are very, very rarely needed."

So simply don't use them.

(And 20MB files are not really large.)


So:

Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
Replace("foo", "bla", BEGIN+NOERR+ALL+LOCAL+REGEXP)
...

should do it.


Christian

 


Topic: Re: Must run macro several times to get all changes (4 of 12), Read 20 times
Conf: VEDIT Macro Language Support
From: Bert Hyman
Date: Wednesday, February 18, 2009 03:05 PM

On 2/18/2009 2:55:42 PM, Christian Ziemski wrote:
>
>"Note: With VEDIT's auto buffering
>File_Read() and File_Write()
> are very, very rarely needed."
>
>So simply don't use them.

Thanks; I'll give that a go on tomorrow's run.

>(And 20MB files are not really large.)

Just showing my age, I guess (he says, looking at the VEDIT PLUS 2.03d install floppy on the shelf).

 


Topic: Re: Must run macro several times to get all changes (5 of 12), Read 21 times
Conf: VEDIT Macro Language Support
From: Christian Ziemski
Date: Wednesday, February 18, 2009 03:40 PM

On 18.02.2009 21:05 vedit-macro-language Listmanager wrote:
> From: "Bert Hyman"
>
> On 2/18/2009 2:55:42 PM, Christian Ziemski wrote:
>
>> (And 20MB files are not really large.)
>
> Just showing my age, I guess (he says, looking at the
> VEDIT PLUS 2.03d install floppy on the shelf).

Funny, I should have some VEDIT install floppies somewhere too.
But I'm a youngster: I started with version 3.6.

;-)

Christian

 


Topic: Re: Must run macro several times to get all changes (6 of 12), Read 18 times
Conf: VEDIT Macro Language Support
From: Bert Hyman
Date: Friday, February 20, 2009 09:49 AM

On 2/18/2009 3:05:10 PM, Bert Hyman wrote:
>On 2/18/2009 2:55:42 PM, Christian
>Ziemski wrote:
>>
>>"Note: With VEDIT's auto buffering
>>File_Read() and File_Write()
>> are very, very rarely needed."
>>
>>So simply don't use them.
>
>Thanks; I'll give that a go on tomorrow's
>run.

Works just fine, and is no slower than the method using the buffering scheme.

Thanks.

 


Topic: Re: Must run macro several times to get all change (7 of 12), Read 20 times
Conf: VEDIT Macro Language Support
From: Pauli Lindgren
Date: Monday, February 23, 2009 04:09 AM

On 2/18/2009 2:55:42 PM, Christian Ziemski wrote:
>
>Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>Replace("foo", "bla", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>...
>
>should do it.

Does that work? If I understand it correctly, the LOCAL option causes the replace to be done only on the part of the file that is currently in memory. That is why the explicit read/write commands are needed. If you just omit them, only the buffered part of the file (128k?) will be processed. Therefore you should omit the LOCAL keyword, too.

The code in the PDF manual is different from huge-sr.vdm, so the manual should be fixed. I think the problem is that if the search string happens to be at the edge of the buffered block, it is not found. That is why huge-sr.vdm uses Replace_Block and limits the search to full lines. But that still does not work with multi-line patterns.
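
To illustrate the boundary effect described above, here is a minimal Python sketch (not VEDIT code). It is a toy model that assumes each block is processed in isolation; the function name chunked_replace, the sample text, and the 8-character block size are made up for illustration and say nothing about how VEDIT really handles its buffers.

```python
def chunked_replace(text, old, new, chunk_size):
    """Replace `old` with `new` one fixed-size chunk at a time.

    Each chunk is processed in isolation, so an occurrence of `old`
    that straddles a chunk boundary is never seen whole and is skipped.
    """
    out = []
    for start in range(0, len(text), chunk_size):
        chunk = text[start:start + chunk_size]
        out.append(chunk.replace(old, new))
    return "".join(out)


# Hypothetical sample data: the middle "or" sits across the 8-character boundary.
text = "or-abc-or-abc-or"

first = chunked_replace(text, "or", "else", chunk_size=8)    # first run
second = chunked_replace(first, "or", "else", chunk_size=8)  # second run
whole = text.replace("or", "else")                           # no chunking at all

print(first)   # else-abc-or-abc-else
print(second)  # else-abc-else-abc-else
print(whole)   # else-abc-else-abc-else
```

In this toy model the first run lengthens the text, which shifts the block boundaries, so the second run catches the leftover. That would be consistent with the symptom in the original post, but it is only a guess at the mechanism.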

Another way to do multiple search is to use search pattern |{}. For example:
Repeat(ALL) {
    Search("|{fee,fie,foe,foo}", ERRBREAK)
    if (Match_Item==1) { Replace("fee", "plugh") }
    if (Match_Item==2) { Replace("fie", "xyzzy") }
    ...
}

This should not cause any problems with buffering.
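
The same idea (search once for any of several strings, then pick the replacement based on which one matched) can be sketched outside VEDIT. This is a minimal Python sketch assuming plain literal strings; the REPLACEMENTS table and the replace_all name are hypothetical, and only the two pairs given above are included.

```python
import re

# Hypothetical replacement table, mirroring the fee/fie pairs above.
REPLACEMENTS = {"fee": "plugh", "fie": "xyzzy"}

# One alternation pattern built from the keys; re.escape keeps them literal.
PATTERN = re.compile("|".join(re.escape(key) for key in REPLACEMENTS))


def replace_all(text):
    """Make a single pass over `text`, choosing the replacement per match."""
    return PATTERN.sub(lambda match: REPLACEMENTS[match.group(0)], text)


result = replace_all("fee fie foe fee")
print(result)  # plugh xyzzy foe plugh
```

Because the text is scanned once, there is no second pass over already-replaced text.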

 


Topic: Re: Must run macro several times to get all change (8 of 12), Read 23 times
Conf: VEDIT Macro Language Support
From: Ian Binnie
Date: Monday, February 23, 2009 06:25 AM

On 2/23/2009 4:09:02 AM, Pauli Lindgren wrote:
>On 2/18/2009 2:55:42 PM, Christian
>Ziemski wrote:
>>
>>Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>>Replace("foo", "bla", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>>...
>>
>>should do it.
>
>Does that work? If I understand it
>correctly, the LOCAL option causes
>replace to be done only on the part of
>file that is currently in memory. That
>is why the explicit read/write commands
>are needed. If you just omit them, only
>the buffered part of the file (128k?)
>will be processed. Therefore you should
>omit the LOCAL keyword, too.
>
>The code in the PDF manual is different
>from huge-sr.vdm, so the manual should
>be fixed. I think the problem is that if
>the search string happens to be at the
>edge of the buffered block, it is not
>found. That is why huge-sr.vdm uses
>Replace_Block and limits the search to
>full lines. But that still does not work
>with multi-line patterns.
>
>Another way to do multiple search is to
>use search pattern |{}. For example:
>Repeat(ALL) {
> Search("|{fee,fie,foe,foo}", ERRBREAK)
>if (Match_Item==1) { Replace("fee",
>"plugh") }
>if (Match_Item==2) { Replace("fie",
>"xyzzy") }
> ...
>}
>This should not cause any problems with
>buffering.

Pauli,

I agree with many of your comments, and while I haven't analysed the macros in detail, I agree that the explicit buffering causes problems, unless managed carefully.

What most people don't realise is that, with the improved virtual memory management in Windows 2K/XP and current RAM sizes (512MB is considered small), the Vedit buffering is probably irrelevant.

Unless you are editing multi-GB files there is no need to worry about Vedit buffering. While it will perform multiple "disk" reads/writes, these will be to virtual memory, and little slower than explicitly managing buffering.

Certainly for tiny 20MB files, there is no need.

 


Topic: Re: Must run macro several times to get all change (10 of 12), Read 20 times
Conf: VEDIT Macro Language Support
From: Pauli Lindgren
Date: Tuesday, February 24, 2009 09:35 AM

On 2/23/2009 6:25:38 AM, Ian Binnie wrote:
>
>What most people don't realise is that
>with the improved virtual memory
>management in Windows 2K/XP in
>conjunction with the current RAM (512MB
>is considered small) the vedit buffering
>is probably irrelevant.
>
>Unless you are editing multi GB files
>there is no need to worry about vedit
>buffering. While it will perform
>multiple "disk" read/writes these will
>be to virtual memory, and little slower
>than explicitly managing buffering.
>
>Certainly for tiny 20MB files, there is
>no need.

Vedit does not use Windows virtual memory to handle large files.
But I guess you are talking about the disk cache. Disk cache is probably much slower than virtual memory (and Vedit's disk buffering is supposed to be faster than the virtual memory of Windows). So I think the LOCAL option should give a significant performance boost even with a few tens of megabytes, if you are performing many replacements.

(Linux has more advanced virtual memory, so maybe Vedit would not need its own disk buffering there.)

By the way, it seems that the multi search pattern |{} is not very fast. With a 5.5MB file, a search for "|125" (which is found near the end of the file) takes around 0.3 seconds. But a search for "|{|123,|125}" takes 3 seconds, i.e. 10 times longer. So it seems that the multi pattern search in Vedit is not very optimized.

--
Pauli

 


Topic: Re: Must run macro several times to get all change (11 of 12), Read 19 times
Conf: VEDIT Macro Language Support
From: Ted Green
Date: Tuesday, February 24, 2009 10:12 AM

At 09:36 AM 2/24/2009, you wrote:
>From: "Pauli Lindgren"
>
>By the way, it seems that the multi search pattern |{} is not very fast. With 5.5MB file, search for "|125" (which is found near the end of file) takes around 0.3 seconds. But search "|{|123,|125}" takes 3 seconds, i.e. 10 times longer. So it seems that the multi pattern search on Vedit is not very optimized.

You are correct that the "|" slows down the search significantly. It could be optimized, but that is currently a low priority.

Ted.

 


Topic: Re: Must run macro several times to get all change (12 of 12), Read 20 times
Conf: VEDIT Macro Language Support
From: Ian Binnie
Date: Tuesday, February 24, 2009 05:48 PM

On 2/24/2009 9:35:55 AM, Pauli Lindgren wrote:
>On 2/23/2009 6:25:38 AM, Ian Binnie
>wrote:

>Vedit does not use Windows virtual
>memory to handle large files.

Vedit can't help using virtual memory, but I agree that it doesn't utilise this to load large files.

Editing a 50MB file in Vedit64 only loads ~115k into the buffer, and takes 5 sec to go to the end of the file.
Notepad loads the whole file into memory.

>But I guess you are talking about the
>disk cache.

There are multiple levels of caching in a modern computer.
Most disk drives have several MB of hard cache, and Windows also caches data. In most applications it is much faster to re-open a file which has been previously opened.

> Disk cache is probably much
>slower than virtual memory (and Vedit's
>disk buffering is supposed to be faster
>than the virtual memory of Windows).

Here you are confusing memory paging to disk with virtual memory mapping, although the latter is also used for paging.

>So
>I think LOCAL option should give
>significant performance boost already
>with few tens of megabytes, if you are
>performing many replacements.

Even HUGE-SR.VDM is only recommended for files >20MB.

>By the way, it seems that the multi
>search pattern |{} is not very fast.

I only use these to search within blocks, where they can simplify logic, without slowing things down too much.

It is a pity that Vedit uses such small buffers. This may have been necessary with Win3.1, but not now.

I used to use complex buffering to load images, now I just allocate enough memory to load the whole file.

 


Topic: Re: Must run macro several times to get all change (9 of 12), Read 21 times
Conf: VEDIT Macro Language Support
From: Christian Ziemski
Date: Monday, February 23, 2009 07:05 AM

On 2/23/2009 4:09:02 AM, Pauli Lindgren wrote:
>On 2/18/2009 2:55:42 PM, Christian Ziemski wrote:
>>
>>Replace("this", "that", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>>Replace("foo", "bla", BEGIN+NOERR+ALL+LOCAL+REGEXP)
>>...
>>should do it.
>
>Does that work? If I understand it correctly, the LOCAL option causes
>replace to be done only on the part of file that is currently in memory.
> [...]
> Therefore you should omit the LOCAL keyword, too.

Pauli, you are right.
I used copy/paste without care :-(

Christian