Line breaks in Excel: how to transform formatting string into a tag?
Thread poster: XLTS
XLTS
XLTS  Identity Verified
Germany
Local time: 19:12
Member (2011)
English to German
Dec 15, 2021

In Studio (2021), line breaks within the columns of my Excel file (which I split into individual files due to their large size) are displayed as "_x000D_" in the resulting XLIFF files. What do I have to do to transform this string into a tag? I do not want to use this formatting string as a segmentation rule, because I am using the same TM for several other (extremely large) files I have already worked on, and I fear corrupting the document structure when converting my translations back into Excel files.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:12
English to Russian
Embedded Content Dec 15, 2021

You can try the 'Embedded content' settings (File Types — Excel). Add the expression as a placeholder.

 
XLTS
XLTS  Identity Verified
Germany
Local time: 19:12
Member (2011)
English to German
TOPIC STARTER
Embedded content settings Dec 16, 2021

Stepan Konev wrote:

You can try the 'Embedded content' settings (File Types — Excel). Add the expression as a placeholder.


Thank you for trying to help. Do you mean just the string "_x000D_", or do I have to integrate it into some regular expression? I tried just the string (using the "Bilingual Excel" file type, as I am dealing with a partly translated file), both with "Extract to defined document structures" and "Extract to all paragraphs" (or whatever the English versions of the options may be, I use the German UI), and this didn't change anything when I created a new project with the same Excel source file...


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:12
English to Russian
\n Dec 16, 2021

Try this regex: \n
To avoid importing and re-importing your file(s), use the Preview feature (flagged below).
[Screenshot: 2021-12-16_121801]


 
XLTS
XLTS  Identity Verified
Germany
Local time: 19:12
Member (2011)
English to German
TOPIC STARTER
\n and _x000D_ Dec 16, 2021

Stepan Konev wrote:

Try this regex: \n
To avoid importing and re-importing your file(s), use the Preview feature (flagged below).


It works in the preview, but not in "real life". I tried both adding the same file under a different name and creating a new project, to no avail. Thank you for your efforts, but this seems to be yet another shortcoming of the eternal bugware Trados Studio, and I cannot afford to lose more time.

EDIT: I successfully have performed these steps on a different computer (with SR1 instead of SR2 and Office 2016 instead of 2019). Thank you!

[Edited at 2021-12-16 11:51 GMT]


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 19:12
German to Swedish
CR or CRLF? Dec 16, 2021

Line breaks are indicated with a variety of character sequences across different platforms. The customary Windows linebreak is 0x0D0A (carriage-return + linefeed, two characters), but from your post it looks like perhaps only a carriage-return character (0x0D) is present in the Excel file.

Just an observation, I don't know if it makes a difference or what to do about it.
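
For what it's worth, as far as I know "_x000D_" is how the Office Open XML format behind .xlsx escapes a carriage return (U+000D) in cell text, with "_x005F_" used to escape a literal underscore in front of such a sequence. The following is only a minimal Python sketch of that escaping scheme, not anything Studio does internally; the function name `decode_ooxml_escapes` is made up for illustration. It also shows why a regex such as \n cannot match the literal text "_x000D_".

```python
import re

# Assumption: escapes follow the _xHHHH_ form (four hex digits).
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_ooxml_escapes(text: str) -> str:
    """Replace every _xHHHH_ escape with the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# A cell whose line break was stored as an escaped CR followed by a real LF.
cell = "First line_x000D_\nSecond line"
print(repr(decode_ooxml_escapes(cell)))  # 'First line\r\nSecond line'

# The regex \n only matches a real line feed, never the literal escape text.
print(re.search(r"\n", "First line_x000D_Second line"))  # None
```

If the escape shows up as plain text in the segment, a placeholder pattern would presumably have to match the literal characters "_x000D_" rather than a control character.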


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:12
English to Russian
Need to re-import from scratch Dec 16, 2021

XLTS wrote:
I successfully have performed these steps on a different computer (with SR1 instead of SR2...
Actually it has nothing to do with SR1 or SR2 or SRx.
Though you can use Preview when setting a regex to see the result, once you have set the regex in Project Settings for a specific project, you need to re-import the file.
This means not just removing the target sdlxliff alone and then preparing the source file again, but particularly removing both files from the source and target folders (you can do it by switching flags on the left pane, in 'Files' view) and then adding and preparing the file from scratch.
When you moved to another computer and Studio, you did it all from scratch. That's exactly why you succeeded. But not because it was SR1.

[Edited at 2021-12-16 12:26 GMT]


 
XLTS
XLTS  Identity Verified
Germany
Local time: 19:12
Member (2011)
English to German
TOPIC STARTER
Further problems with Excel line breaks Dec 16, 2021

Stepan Konev wrote:

but particularly removing both files from the source and target folders


You mean, it isn’t enough to delete the project, but I also have to copy the source file to a different folder?

you can do it by switching flags on the left pane, in 'Files' view


I am afraid I am not with you there: what exactly do you mean by switching flags, please?

BTW, a few hours later, as happens every 3-4 months, I encountered the error "Object reference not set to an instance of an object", so once more I had to delete the folders in c:\Users\[USERNAME]\AppData\Roaming\SDL\SDL Trados Studio\, after which Studio seemed to behave "reasonably" again on my desktop PC as well (until further notice).

Subsequently, I ventured into trying to use the string "[x000D]" as a segmentation rule after all. Initially I succeeded, at least partially: the string was recognized correctly after entire sentences, but not after numbers; also, double line breaks ([x000D][x000D]) continued to appear (as tags). The annoying thing is that after I changed the settings to the values shown below (finally, after unsuccessfully trying different variations), I could no longer even reproduce that "partly" successful first result: now all of these strings are displayed as tags again. In my numerous attempts I always deleted the project folder, but kept the same source file in the same source folder. When I eventually deleted the placeholder definition, the string was still converted to a tag until I closed Studio altogether and reopened it. Do I have to do this after each and every unsuccessful attempt?!

I would be very grateful for any suggestion how to escape from this quagmire.

[Screenshot: Studio2]

[Screenshot: Studio1]


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:12
English to Russian
Flags Dec 16, 2021

XLTS wrote:
You mean, it isn’t enough to delete the project, but I also have to copy the source file to a different folder?
Nope. When you prepare a file for translation, Studio creates 2 sdlxliff files: one in the source language folder and one, with the same name, in the target language folder. You can find both files in Windows explorer and you can also see them in Studio when you click the Files view. In Files view, when you select the source language flag, you see the source sdlxliff files:
[Screenshot: source]
When you select the target language flag, you see the target sdlxliff files:
[Screenshot: target]
You don't need to remove your project. What you do need to do is remove the sdlxliff files displayed under the source language flag. When you delete files from the source, Studio automatically deletes the corresponding files from the target as well. On the other hand, if you only delete files under the target language flag, the source files will still be there. That is why you have to delete the files under the source language flag, not the target one.
Then you can add files to the same project with new settings and prepare the files for translation.
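
To double-check outside Studio that no stale sdlxliff files are left before re-adding the source file, something like the following Python sketch could be used. It only lists files and deletes nothing. It assumes the project folder contains one subfolder per language with the sdlxliff files inside; the function name `sdlxliff_by_folder` and the example path are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def sdlxliff_by_folder(project_dir: str) -> dict[str, list[str]]:
    """Group every *.sdlxliff file under project_dir by its parent folder name."""
    found = defaultdict(list)
    for path in sorted(Path(project_dir).rglob("*.sdlxliff")):
        found[path.parent.name].append(path.name)
    return dict(found)


if __name__ == "__main__":
    # Hypothetical project location; replace with your own project folder.
    for folder, files in sdlxliff_by_folder(r"C:\Projects\MyProject").items():
        print(folder, files)
```

If the source language folder still lists the old file, the new settings have presumably not been applied to it yet.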

[Edited at 2021-12-16 20:28 GMT]


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:12
English to Russian
Try this regular expression instead: .[\n]+ Dec 16, 2021

XLTS wrote:
Subsequently, I ventured into trying to use the string "[x000D]" as a segmentation rule after all.
Recently, a similar question was posted at the RWS Community.
Try the video suggested there: https://www.youtube.com/watch?app=desktop&v=kPaHs5xjWyU
In that case, the topic starter most probably couldn't achieve the desired result because they didn't re-import the files. Once you have introduced any new settings or properties into your project, you always have to re-create the sdlxliff files. This is a must. It is not enough to remove only the target sdlxliff files. You have to re-create the source sdlxliff files as well. That's why you have to delete the older source sdlxliff files too.

[Edited at 2021-12-16 20:38 GMT]


 
XLTS
XLTS  Identity Verified
Germany
Local time: 19:12
Member (2011)
English to German
TOPIC STARTER
\n won't work either Dec 17, 2021

Stepan Konev wrote:

You don't need to remove your project.


Deleting the project may not be necessary, but it nonetheless should "do the trick". The fact that it doesn't seems to be proof that this is yet another would-be feature of Studio that doesn't work the way it is supposed to.



I know about this video, and actually, the suggestion it contains was the first thing I tried the day before yesterday, but all this "\n" segmentation rule does in my case (and I always started a new project) is oddly split sentences into two or more segments by cutting words right after the letter "x" (e.g. ex|igus, ex|trèmes, ex|tracteurs)... %-)
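
One possible explanation for the cuts after "x", and this is only a guess: if a rule containing "[x000D]" was still in effect and was interpreted as a regular expression, then square brackets define a character class, so the pattern matches any single "x", "0" or "D" instead of the literal string. The Python sketch below only illustrates that regex behaviour; Studio uses its own regex engine and rule format, and the helper `split_after` is made up for the demonstration.

```python
import re


def split_after(pattern: str, text: str) -> list[str]:
    """Split text right after every match of pattern, keeping the matched text."""
    pieces, start = [], 0
    for match in re.finditer(pattern, text):
        pieces.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        pieces.append(text[start:])
    return pieces


# "[x000D]" is a character class: it matches a single x, 0 or D.
print(re.findall(r"[x000D]", "extracteurs 2010 D"))  # ['x', '0', '0', 'D']
print(split_after(r"[x000D]", "extracteurs"))        # ['ex', 'tracteurs']

# Escaping the literal string makes the pattern match only the whole sequence.
literal = re.escape("_x000D_")
print(split_after(literal, "extracteurs_x000D_exigus"))
# ['extracteurs_x000D_', 'exigus']
```

If that is what happened, a rule matching the escaped literal string (or a real line break character, depending on what the file type filter actually passes on) should no longer break words at "x".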

BTW, at long last, I have grasped what you meant by "switching flags": it just didn't occur to me that you might refer to an actual country flag, when all the time I had a file property in my mind.



[EDIT: minor corrections]

[Edited at 2021-12-17 01:22 GMT]


 

