Best tool to handle 1000s of small txt files
Thread poster: Mark Berelekhis
Mark Berelekhis
Mark Berelekhis  Identity Verified
United States
Local time: 21:47
Russian to English
+ ...
Sep 24, 2012

Hi all,

I have 1600+ text files in 100+ folders (and subfolders) that need to be translated. My task is as follows:

1. Combine all the files into a single bilingual file.
2. Translate the file.
3. Split the bilingual file back to monolingual translated files while preserving the original folder structure (this is crucial).

Currently I use Wordfast Classic and MemoQ, but I'm not particularly proficient beyond their basic functions.

Can either tool provide a solution? If not, which tool might? Any suggestions welcome.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 03:47
Member (2006)
English to Afrikaans
+ ...
I merge and split them Sep 24, 2012

Mark Berelekhis wrote:
I have 1600+ text files in 100+ folders (and subfolders) that need to be translated. My [need] is as follows:

1. Combine all the files into a single bilingual file.
2. Translate the file.
3. Split the bilingual file back to monolingual translated files while preserving the original folder structure (this is crucial).


For this type of job, I use Wordfast for the translation, and I use a pair of simple scripts to merge and split the files. The scripts are here:

http://wikisend.com/download/143892/merge%20and%20split%20files.zip

You have to install AutoIt to use them. First use the merge script, then afterwards use the split script. You'll notice that the full path of each file is in the merged file, so if you want them to be saved in a different top-level folder, simply change the folder names in the merged file. The UTF16LE scripts assume that your files are in UTF-16 LE, and the other scripts assume your files are in UTF-8 with a byte order mark. Your mileage may vary with other encodings.

A small bug: make sure before you split the files that the merged file has three blank lines at the top -- Wordfast tends to remove them when you clean the file.
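
For readers who do not want to install AutoIt, the merge-then-split idea can be sketched in Python. This is not a port of the scripts linked above: the `#### FILE: ` header line, the function names `merge` and `split`, and the choice to store relative rather than full paths are all assumptions made for illustration. It assumes UTF-8 files with a byte order mark, and that no line of real text starts with the header marker.

```python
from pathlib import Path

# Hypothetical header line written before each file's text.
# This is an illustrative format, not the one used by the AutoIt scripts.
MARKER = "#### FILE: "


def merge(root, merged, ext=".txt", encoding="utf-8-sig"):
    """Copy every file with the given extension under root into one merged file."""
    files = sorted(
        p for p in root.rglob(f"*{ext}")
        if p.is_file() and p.resolve() != merged.resolve()
    )
    with merged.open("w", encoding=encoding) as out:
        for path in files:
            text = path.read_text(encoding=encoding)
            # The relative path is what lets split() rebuild the folder tree.
            out.write(f"{MARKER}{path.relative_to(root).as_posix()}\n")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep the next header on its own line
    return len(files)


def split(merged, out_root, encoding="utf-8-sig"):
    """Recreate one file per header line, with the same subfolders, under out_root."""
    chunks = {}
    current = None
    with merged.open(encoding=encoding) as source:
        for line in source:
            if line.startswith(MARKER):
                current = line[len(MARKER):].strip()
                rel = Path(current)
                if rel.is_absolute() or ".." in rel.parts:
                    raise ValueError(f"unsafe path in merged file: {current}")
                chunks[current] = []
            elif current is not None:
                chunks[current].append(line)
    for rel, lines in chunks.items():
        target = out_root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("".join(lines), encoding=encoding)
    return len(chunks)
```

Typical use would be `merge(Path("source"), Path("merged.txt"))`, translating `merged.txt` while leaving the header lines untouched, and then `split(Path("merged.txt"), Path("translated"))`. A file that did not end with a line break gains one in the round trip.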

Let me know if this works for you.

Samuel


 
Narcis Lozano Drago
Narcis Lozano Drago  Identity Verified
Spain
Local time: 03:47
Member (2007)
English to Spanish
+ ...
In MemoQ Sep 24, 2012

Import all the files and select them. Right click and select "Create View". Select "Simply glue documents together" and name the view. Open the view (it will be in the Views tab, not in the Documents tab). Translate the segments in the view; the progress in the files will be updated automatically.

Finally, select all the documents and export the final files.

Narcis


 
Mark Berelekhis
Mark Berelekhis  Identity Verified
United States
Local time: 21:47
Russian to English
+ ...
TOPIC STARTER
Thanks, Samuel Sep 24, 2012

Sadly, that goes well beyond my abilities with software. I was hoping there was an automated solution.

The project is large enough that it makes sense to invest in a new tool if it makes my life easier.


 
Egidijus Slepetys
Egidijus Slepetys  Identity Verified
Local time: 04:47
German to Lithuanian
Star Transit Sep 24, 2012

It can add an indefinite number of folders/subfolders (preserving the folder structure) and it can open all the files at once.
The most professional and expensive tool.


 
Grzegorz Gryc
Grzegorz Gryc  Identity Verified
Local time: 03:47
French to Polish
+ ...
DVX2 Sep 24, 2012

Egidijus Slepetys wrote:

It can add an indefinite number of folders/subfolders (preserving the folder structure) and it can open all the files at once.
The most professional and expensive tool.


DVX2 does the same but is far cheaper.
memoQ is also good, but the way it displays the files (a flat list in which files with identical names are renamed) is IMO less handy.

Cheers
GG


 
Grzegorz Gryc
Grzegorz Gryc  Identity Verified
Local time: 03:47
French to Polish
+ ...
In memoQ, revisited... Sep 24, 2012

Narcis Lozano Drago wrote:

Import all the files and select them.


Create an empty project first, then use the "Import folder structure" function from the Dashboard.

Attention.
Place the source folder as close to the root as possible and check if the files were selected properly.
Sometimes memoQ 6 selects files from the parent directory, i.e. you can end up importing far more files than you want...
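
One way to check the selection is to count the files on disk before importing and compare that number with what the import dialog shows. The sketch below is only an illustration in Python: the function name `survey` is made up, and it knows nothing about memoQ itself. It also lists file names that occur in more than one folder, since those are the ones a flat file list has to rename.

```python
from collections import defaultdict
from pathlib import Path


def survey(root, ext=".txt"):
    """Return (number of matching files, {file name: [folders]} for repeated names)."""
    by_name = defaultdict(list)
    total = 0
    for path in root.rglob(f"*{ext}"):
        if path.is_file():
            total += 1
            # Folder is recorded relative to root; "." means root itself.
            by_name[path.name].append(path.parent.relative_to(root).as_posix())
    duplicates = {
        name: sorted(folders)
        for name, folders in by_name.items()
        if len(folders) > 1
    }
    return total, duplicates
```

Calling `survey(Path("source"))` gives the expected file count and the repeated names, so an import that shows more files than the count has probably picked up a parent directory.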

Right click and select "Create View". Select "Simply glue documents together" and name the view. Open the view (it will be in the Views tab, not in the Documents tab). Translate the segments in the view; the progress in the files will be updated automatically.

Finally, select all the documents and export the final files.

Exactly.

Cheers
GG


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 03:47
Member (2006)
English to Afrikaans
+ ...
How automated do you want? Sep 24, 2012

Mark Berelekhis wrote:
Sadly, that goes well beyond my abilities with software. I was hoping there was an automated solution.


Erm, how automated do you want? Double-click the "merge" script. It asks you for the file extension, then it asks you where to save the merged file. Then it merges all files with that file extension in the current folder (and subfolders) into a single file. Then you translate the file, and when you're done, save the file as plain text again. Then double-click the "split" script. It asks you which file you want to split. You select the translated file, and it splits it into separate files.

Many CAT tools can handle multiple files in multiple subdirectories, but you specifically wanted something that will display all text in a single pane, isn't that right?

Samuel


 
Mark Berelekhis
Mark Berelekhis  Identity Verified
United States
Local time: 21:47
Russian to English
+ ...
TOPIC STARTER
Thanks! Sep 24, 2012

Grzegorz Gryc wrote:

Narcis Lozano Drago wrote:

Import all the files and select them.


Create an empty project first, then use the "Import folder structure" function from the Dashboard.

Attention.
Place the source folder as close to the root as possible and check if the files were selected properly.
Sometimes memoQ 6 selects files from the parent directory, i.e. you can end up importing far more files than you want...

Right click and select "Create View". Select "Simply glue documents together" and name the view. Open the view (it will be in the Views tab, not in the Documents tab). Translate the segments in the view; the progress in the files will be updated automatically.

Finally, select all the documents and export the final files.

Exactly.

Cheers
GG


Thanks very much. After fixing a few hiccups with encoding, it worked perfectly.
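
The thread does not say what the encoding hiccups were. As a general precaution with a large batch of text files, it can help to check which encodings are present before importing. The Python sketch below is an assumption-laden illustration, not a description of what was actually done here: it guesses from the byte order mark, falls back to strict UTF-8, and labels everything else "unknown" (legacy code pages cannot be identified reliably this way). The names `guess_encoding` and `report` are invented.

```python
import codecs
from pathlib import Path

# UTF-32 is tested before UTF-16 because the UTF-32 LE mark begins with
# the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(path):
    """Guess from the byte order mark; otherwise try strict UTF-8."""
    data = path.read_bytes()
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


def report(root, ext=".txt"):
    """Group the matching files under root by guessed encoding."""
    groups = {}
    for path in sorted(root.rglob(f"*{ext}")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            groups.setdefault(guess_encoding(path), []).append(rel)
    return groups
```

If `report(Path("source"))` returns more than one group, the files are mixed and may need converting to a single encoding before they are imported or merged.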


 

