Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik and ServiceNow Partner to Bring Trusted Enterprise Context into AI-Powered Workflows. Learn More!
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Fragmentation PK on UUID for SQL Server

Hi everybody.
in a customer MDM project, we set Talend MDM on SQL Server 2008 R2 with talend 5.6.1.
After using autoIncrement, we decide to move on UUID generated in the job. So field's type for PrimaryKey are nvarchar(255).
I encounter the following problem: fragmentation are too important on each PK (99%)
the first big load records around 15 millions data in each entities.
weekly charges will records 200 000 lines.
I know that we can launch a reorganization but i should be too much 1 per week. . .

i try to change type in one entity setting in uniqueidentifier but same result
Labels (2)
7 Replies
Anonymous
Not applicable
Author

Hello,
the right datatype for UUID is either char(36) (this is a UUID: 0862164E-235D-4DD3-9964-905F471EFD06) or a databaseinternal type the dbms vendor recommends for this.
if you have problems with a corrupted PK i think you are in the wrong Forum, you should ask in a SQL-Server Forum

read here: stackoverflow.com/questions/4503501/microsoft-sql-server-2008-99-fragmentation-on-non-clustered-non-...
or here:  www.sqlskills.com/blogs/paul/can-guid-cluster-keys-cause-non-clustered-index-fragmentation/

anyway, AFAIK you first load data to a table, then create the index.


regards
dj
Anonymous
Not applicable
Author

Hi,
maybe i have to try char(36).
but Sorry, i think you don't understand my problem.
we launched the MDM project with SQL Server recommanded by Talend itself.
we were in autoIncrement (best practise from my customer) but in MDM, autoincrement is managed by an unique table, so to add data in parallel, we had bottleneck.
So to try with UUID, we change all PK type and now it's much better.
The last and not the least is when we generate UUID to inject, data is directly fragmented.
I have no problem with my PK, just bad performance due to fragmentation.
Maybe you have right, it's a SQL Server topic, but it is a typical case to use MDM with SQL server when we want to use UUID.
the goal is to find the best type or to organize UUID.
in the futur we will inject 100 000 new data a week, we can't sort UUID before injection or reorganize index every day. . . too heavy
Anonymous
Not applicable
Author

you can forget immediatly type char(36).
too much time too load datas and SQL server doesn't like it fot a PK type.
Moreover fragmentation stay the same.
Anonymous
Not applicable
Author

please Post the create table Statement with the pk ceration here
Anonymous
Not applicable
Author

why not but i don't understand the interest, Talend MDM generate itself the model and so the table.
you just have to choose string in the MDM model and it creates the table in SQL Server
by default: nvarchar(255) CLUSTERED for a pk in string
Anonymous
Not applicable
Author

First: i dont run MDM, so if you ask for help ...

Second: UUID is random data, randomdata is not good ( IMHO the worst)  for clustered indicies with heavy upload.
A Clustered index is ONLY good when you append data, otherwise the leafs of the table are reorged all the time.
Change this, if it is MDM internal open a case  with talend.
Anonymous
Not applicable
Author

Hi,
one more information, Talend MDM generate itself PK named "x_talend_id" when a sub table (complexetype in an entity) is created.
this x_talend_id is by default in UUID and it's amazing, this UUID is sorted so no fragmentation!

To conclude i know that there is a solution known by Talend. . . i try to open an issue