Currently, my Spark job writes its result through log.info(). What I am trying to do now is fetch that output and store it in a MySQL database using Talend components.
Result from spark:
INFO:spark2:Source rows = 2692687
INFO:jspark2:Destination rows = 2692687
My idea is to use this workflow to achieve the objective:
tSystem (submit Spark job) >> tFileInputDelimited (fetch the printed result from Spark) >> <some MySQL component to store the result in MySQL>
Is it possible to do it this way?
Hello,
You can read and parse the logs produced by Spark with a flow such as tFileInputDelimited -> tFilterRow -> tDBOutput (MySQL).
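To illustrate what the parsing step has to do, here is a minimal plain-Java sketch (not Talend-generated code). It assumes the log lines look exactly like the two lines in the question, i.e. `INFO:<logger>:<label> = <number>`; the class name `SparkLogParser` and the method `parse` are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SparkLogParser {
    // Matches lines such as "INFO:spark2:Source rows = 2692687".
    // Group 1 is the label ("Source rows"), group 2 is the numeric value.
    private static final Pattern ROW_COUNT =
            Pattern.compile("INFO:[^:]+:(.+?)\\s*=\\s*(\\d+)\\s*");

    // Returns label -> value for every matching line; other lines are ignored.
    public static Map<String, Long> parse(String log) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : log.split("\\R")) {
            Matcher m = ROW_COUNT.matcher(line);
            if (m.matches()) {
                counts.put(m.group(1), Long.parseLong(m.group(2)));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String log = "INFO:spark2:Source rows = 2692687\n"
                   + "INFO:jspark2:Destination rows = 2692687";
        // Prints {Source rows=2692687, Destination rows=2692687}
        System.out.println(parse(log));
    }
}
```

In a Talend job the same split could probably be done without custom code, by filtering on the "INFO:" prefix and splitting on the delimiters, but the regular expression shows which parts of each line end up as the label and the value to be written to MySQL.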
Another option would be to use the outputLine of tSystem if the job you call produces console logs.
Log4j messages themselves could be forwarded to a database. One would have to use JDBCAppender for log4j.
In case of Log4J2 the config I used (for a POC!) was:
<JDBC name="dbAppender" tableName="log4j2.all_log" connectionSource="PoolingDriver">
  <DriverManager connectionString="jdbc:postgresql://localhost:5432/postgres"
                 driverClassName="org.postgresql.Driver" username="postgres" password=":)" />
  <Column name="moment" isEventTimestamp="true" />
  <Column name="origin" isUnicode="false" pattern="%replace{%msg}{(.+) - (.*)}{$1}" />
  <Column name="message" isUnicode="false" pattern="%replace{%msg}{(.+) - (.*)}{$2}" />
  <Column name="mdc" isUnicode="false" pattern="%X" />
  <Column name="level" isUnicode="false" pattern="%level" />
  <Filters>
    <MapFilter onMatch="NEUTRAL" onMismatch="ACCEPT">
      <KeyValuePair key="_pid" value="0"/>
    </MapFilter>
    <RegexFilter regex="^(connectionStatsLogs|talendStats_|talendMeter_|talendLogs_|tLogRow_).+" onMatch="DENY" onMismatch="ACCEPT"/>
    <RegexFilter regex=".+ - Parameters:.+" onMatch="DENY" onMismatch="ACCEPT"/>
  </Filters>
</JDBC>
This config is not for production use as it creates a lot of new connections.
Hi,
I have since found another way to fetch the string output:
tSystem_1 (print the output) -> tJava (get the output and bring to next component)
tSystem_1 output:
('source count:', '1000')
('destination count:', '100')
tJava code:
String output=((String)globalMap.get("tSystem_1_OUTPUT"));
System.out.println("Printing the error code 1 : "+StringUtils.substringBetween(output,"source count:", "destination count:"));
tJava output:
Printing the error code 1 : ', '1000')
('
However, tJava returns everything between the strings "source count:" and "destination count:", so the result includes the quotes, the brackets and the line break. The expected result is just the value 1000.
How should I change this Java code to fetch only that value?
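One possible approach, sketched here under the assumption that the tSystem output always has the format shown above (a label followed by some punctuation and then the digits), is to use a regular expression instead of substringBetween. The class `CountExtractor` and the method `valueAfter` are hypothetical names for this example, not part of Talend or Commons Lang.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CountExtractor {
    // Returns the first run of digits that follows the label,
    // skipping any non-digit characters in between (quotes, commas, spaces).
    // Returns null when the output is null or the label is not found.
    public static String valueAfter(String output, String label) {
        if (output == null) {
            return null;
        }
        Matcher m = Pattern.compile(Pattern.quote(label) + "\\D*(\\d+)").matcher(output);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Same text as the tSystem_1 output shown in the post.
        String output = "('source count:', '1000')\n('destination count:', '100')";
        System.out.println(valueAfter(output, "source count:"));      // prints 1000
        System.out.println(valueAfter(output, "destination count:")); // prints 100
    }
}
```

Inside tJava, the body of `valueAfter` could be applied directly to the string read from globalMap.get("tSystem_1_OUTPUT"), with java.util.regex.Pattern and java.util.regex.Matcher imported in the component's import section.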