Hey all,
I am loading data into Catalog right now and have some confusion about registered entities.
From the documentation, I understood that registered entities load only metadata and profiling statistics, and can provide sample data to users. However, I got confused when looking at the data load logs, which show that the number of records loaded equals the total number of records in that entity.
My question is: does this mean all of the data is still fully loaded into Catalog for registered entities? And from a space conservation standpoint, is it recommended to use the property src.file.glob to define the percentage of the sample retained, to save storage space?
Thank you!
Let me see if I can clarify. The first thing that we'll do to help here is to remove sample data from the discussion. The sample is independent of the profile.
When you load a registered entity, all of the data (or whatever matches the entity's src.file.glob) is transferred to the Catalog, which is why the load log shows the full record count. The data is profiled, and then that temporary copy of the data is removed. Yes, you will use temporary storage while the profiling takes place, but once it has completed, all that is retained is the profiling statistics and any sample data.
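To make that flow concrete, here is a rough Python sketch of the same idea. It is not how Catalog is actually implemented; the function name `load_registered_entity`, its parameters, and the CSV input are all assumptions made up for illustration. It only shows the sequence: copy everything the glob matches into temporary storage, profile all of it, draw a sample separately, then delete the temporary copy.

```python
import csv
import glob
import random
import shutil
import tempfile
from pathlib import Path


def load_registered_entity(src_file_glob, sample_size=5, seed=0):
    """Hypothetical sketch: profile all matched records, keep only stats and a sample."""
    # Temporary storage used only while profiling runs.
    staging = Path(tempfile.mkdtemp(prefix="staging_"))
    try:
        rows = []
        for src in sorted(glob.glob(src_file_glob)):
            staged = staging / Path(src).name
            shutil.copy(src, staged)  # full transfer: every record is copied
            with open(staged, newline="") as f:
                rows.extend(csv.DictReader(f))

        # Profiling statistics are computed over all loaded records,
        # so the loaded count equals the entity's total record count.
        columns = list(rows[0].keys()) if rows else []
        profile = {
            "records_loaded": len(rows),
            "columns": {
                col: {
                    "distinct": len({r.get(col) for r in rows}),
                    "empty": sum(1 for r in rows if r.get(col) in ("", None)),
                }
                for col in columns
            },
        }

        # The sample is drawn independently of the profile.
        rng = random.Random(seed)
        sample = rng.sample(rows, min(sample_size, len(rows)))
    finally:
        # The temporary copy is removed once profiling has finished.
        shutil.rmtree(staging)

    # Only the profile and the sample are retained; the staging path is
    # returned just so a caller can confirm it no longer exists.
    return profile, sample, staging
```

In this sketch the record count reported in the profile is the full count even though nothing but the statistics and the sample survives the call.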
I hope this helps.