Hello Kousik,
Handling full data load via continuous load is possible and won't violate the SD primary key.
This is different from the load ID that will be generated by this continuous load.
I will explain those working principles below via an example.
For example, in my case, loads 11 and 12 were both submitted via my continuous load.
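To make the idea concrete, here is a toy model in Python with an in-memory SQLite table. It is only a sketch and not Semarchy's actual implementation: the table `sd_person`, the column `first_name`, the constant `CONTINUOUS_LOAD_ID` and the helpers `push_full_set` and `simulate_submit` are hypothetical, and it assumes that the platform re-stamps the pending rows with the newly generated load ID when it submits them.

```python
import sqlite3

# Hypothetical ID reserved for the continuous load (assumption for this sketch).
CONTINUOUS_LOAD_ID = 5

conn = sqlite3.connect(":memory:")
# Simplified stand-in for an SD table with the three-column key from the question.
conn.execute("""
    CREATE TABLE sd_person (
        b_loadid   INTEGER NOT NULL,
        b_pubid    TEXT NOT NULL,
        b_sourceid TEXT NOT NULL,
        first_name TEXT,
        PRIMARY KEY (b_loadid, b_pubid, b_sourceid)
    )
""")

def push_full_set(rows):
    """Insert the full data set using the continuous load's own ID."""
    conn.executemany(
        "INSERT INTO sd_person VALUES (?, ?, ?, ?)",
        [(CONTINUOUS_LOAD_ID, pub, src, name) for pub, src, name in rows],
    )

def simulate_submit(generated_load_id):
    """Assumed platform behavior: pending rows move to a newly generated load ID."""
    conn.execute(
        "UPDATE sd_person SET b_loadid = ? WHERE b_loadid = ?",
        (generated_load_id, CONTINUOUS_LOAD_ID),
    )

# First full load, submitted as load 11.
push_full_set([("CRM", "1", "Ann"), ("CRM", "2", "Bob")])
simulate_submit(11)

# Second full load with the same publisher and source IDs, submitted as load 12.
# No key violation occurs because the earlier rows no longer carry the continuous load ID.
push_full_set([("CRM", "1", "Ann"), ("CRM", "2", "Robert")])
simulate_submit(12)

load_ids = [row[0] for row in conn.execute(
    "SELECT DISTINCT b_loadid FROM sd_person ORDER BY b_loadid")]
```

In this model both full sets coexist in the table under different load IDs, which is also why the volume grows with every full load.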
That being said, the fact that continuous load can handle a full data load doesn't mean that it is the most suitable implementation; it all depends on your context.
As you might know, a full data load results in higher data volumes in the source tables, as well as in the history tables if you have activated them (you might then need to keep purging in mind).
So, depending on your context, delta detection might make more sense. I will leave it to your discretion to analyze this on your end.
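If you go for delta detection, one common approach is to compare a hash of each record with the hash kept from the previous run and to push only the records that are new or changed. The following Python sketch illustrates the principle only; the names `record_hash` and `detect_delta` and the `id` key field are hypothetical and not part of Semarchy, and the sketch does not handle deleted records.

```python
import hashlib
import json

def record_hash(record):
    """Stable hash of a record's content (keys are sorted so field order does not matter)."""
    payload = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def detect_delta(previous_hashes, full_extract, key="id"):
    """Return (changed_records, new_hashes).

    previous_hashes: dict mapping record key -> hash stored after the last run.
    full_extract: iterable of dicts, the complete data set read from the source.
    """
    changed = []
    new_hashes = {}
    for record in full_extract:
        digest = record_hash(record)
        new_hashes[record[key]] = digest
        # A record is part of the delta if it is new or its content changed.
        if previous_hashes.get(record[key]) != digest:
            changed.append(record)
    return changed, new_hashes

# First run: nothing is known yet, so every record is part of the delta.
first = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
changed_first, hashes = detect_delta({}, first)

# Second run: only the modified record (id 2) and the new record (id 3) are kept.
second = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]
changed_second, hashes = detect_delta(hashes, second)
```

Only the records in `changed_second` would then be inserted into the SD table, which keeps the source and history tables smaller than with repeated full loads.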
I hope this clarifies how continuous load works.
Wishing you a good day
Best regards,
Alexia
Thank you very much for this informative and highly helpful information.
Kousik Das
Hi,
I have a requirement as below.
1. Every time, the full set of data is to be loaded through a continuous load (through an INTEGRATE_ALL integration job).
2. Unlike in the tutorial example below
(xdm-tutorials/integration/3-load-data-via-sql/sqlserver/7-update-1-insert-sd-person.sql at master · semarchy/xdm-tutorials · GitHub),
we are not planning to do any duplicate detection during the load, i.e. every time the full set of data will be pushed to the SD table.
Technically it should be feasible, but I have a few doubts, as below:
1. Do we need to delete/truncate the SD table before every submit, or should we keep the data as is in the SD table?
As per my understanding, if I use a continuous load, then the B_LOADID will be reserved and reused for that particular continuous load name. So after the first
submission and execution through this continuous load, if I next try to simply insert the full set of data again with the same B_LOADID, B_SOURCEID,
B_PUBID combination into the SD table of a fuzzy entity, it will not allow me to do that, as the above 3 columns form a system-defined constraint on the SD table.
So, if there is an update in any other column in the second set of data, it will not be considered.
2. Secondly, with reference to the above scenario,
a) How (strategically) can the full set of data be loaded every time through the continuous load facility?
b) Is delta detection mandatory before inserting into the Semarchy SD table in order to configure and use a continuous load?
Please help me with your valuable suggestions, and also let me know if I have missed anything already present in Semarchy.
Regards,
Kousik