6 Steps to Easily Parse Data from a Trusted Source

by Gary Roberts | Updated: 03/02/2016 | Comments: 1


Would it be helpful to include data from a reputable source with your own data? If you have permission to use another source’s data for free or by agreement, how can you easily extract the specific data you want to use without doing a lot of coding?

In this article, I’ll show you how you can use an instruction in the CRBasic programming language to reap the benefit of a trusted source’s data while saving yourself a lot of time and effort. For example, you might want to pull data from a known, good source. The data may be stored on a government server or another source at no cost, and be offered in several different formats, including the eXtensible Markup Language (XML).

An Example of How Parsing Data Works

To highlight the six parsing steps, I’ll walk you through an example. In this example, we need the temperature in Fahrenheit from a NOAA weather station at an airport (KLGU). We want to use this temperature data with the data from a weather station that is just a couple of miles away.

The airport’s NOAA weather station data is hosted on NOAA web servers and is available for public use. By doing a quick search of their website, we found the airport’s weather station data here: http://w1.weather.gov/xml/current_obs/KLGU.xml. The data, including temperature, is posted hourly in XML format similar to this:

[Image: KLGU Airport weather data]

To see the actual XML code, we have to right-click the web page and select View page source from the menu. What we see looks similar to this:

[Image: XML code from KLGU Airport weather data]

There is a lot of data in the XML code. If we parsed it by hand with ordinary string-handling code, the coding would take us quite some time. Fortunately, CRBasic has an XMLParse() instruction that we can use to save us a couple of hours of keyboarding time.

#1 - Declare the variables

To get started with the XMLParse() instruction, we need to declare a few constants. Most of them name the codes that XMLParse() returns to report where it is in the XML file, whether an error occurred, and whether it has finished parsing. The last two cap how much of the datalogger's memory the instruction may use.

'XMLParse return values. We use these to keep track of where we are and if we have errors.
Const XML_TOO_MANY_NAMESPACES = -3 ' Too many name space declarations encountered while parsing an element
Const XML_NESTED_TOO_DEEP = -2 ' Too many nested XML elements
Const XML_SYNTAX_ERROR_OR_FAILED = -1 ' XML syntax error or XMLParse failed
Const XML_UNRECOGNIZED_ERROR_CONDITION = 0 ' Unrecognized error condition
Const XML_START = 1 ' Start of XML element
Const XML_ATTRIBUTE_READ = 2 ' XML attribute read.
Const XML_END_OF_ELEMENT = 3 ' End of XML element
Const XML_END_OF_DOCUMENT = 4 ' END of XML document encountered.

'XMLParse max settings so we don't use all of the datalogger's memory
Const XML_MAX_DEPTH = 10
Const XML_MAX_NAMESPACES = 3

#2 - Use a variable to store the results

We also need a variable in which to store the results from the XMLParse() instruction:

Public noaa_air_temperature_f

#3 - Add variables for parsing

In addition, we need a few more variables for the XMLParse() instruction to use while it is parsing the XML file. We could use Dim variable declarations, but let’s use Public variables to aid in our troubleshooting.

Public xml_attribute_name As String
Public xml_attribute_namespace As String * 100
Public xml_data As String * 3000
Public xml_element_name As String * 50
Public xml_element_namespace As String * 30
Public xml_response_code
Public xml_state
Public xml_value As String * 50

#4 - Add variables for file retrieval

To retrieve the XML file from the server, we are going to use the HTTPGet() instruction. For this instruction, we need to add a couple of variables:

Public xml_http_header As String * 300
Public xml_http_socket As Long

#5 - Add code to load the file

To get the XML file from the server and load it into the xml_data variable, we need to add the following code somewhere in a slow sequence scan:

xml_http_header = ""
xml_http_socket = HTTPGet("http://w1.weather.gov/xml/current_obs/KLGU.xml", xml_data, xml_http_header)
TCPClose(xml_http_socket) 'Close our connection to the web server.

#6 - Add a while loop

To let the XMLParse() instruction do its work, we add a while loop (using the While/Wend instruction) and set the initial xml_response_code.

xml_response_code = XML_START 'Tells XMLParse that we are just starting.
While ((xml_response_code > XML_UNRECOGNIZED_ERROR_CONDITION) AND (xml_response_code <> XML_END_OF_DOCUMENT))
  xml_response_code = XMLParse(xml_data, xml_value, xml_attribute_name, xml_attribute_namespace, _
    xml_element_name, xml_element_namespace, XML_MAX_DEPTH, XML_MAX_NAMESPACES)
  If xml_response_code = XML_END_OF_ELEMENT AND xml_element_name = "temp_f" Then
    noaa_air_temperature_f = xml_value
  EndIf
Wend

On each pass through the while loop, the XMLParse() instruction reads the next piece of the XML file, and the loop checks whether it has just reached the end of the element named temp_f. When it has, the program assigns the value positioned between <temp_f> and </temp_f> to noaa_air_temperature_f.

We are now getting the value we wanted (temperature in Fahrenheit) from the NOAA station, and we can include it with our own weather station data.
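
The six steps can be assembled into one program along the following lines. This is a minimal sketch rather than the working program offered for download. The scan intervals, the empty main scan, the temp_found flag, clearing xml_data before each request, and writing NAN when no temperature is found are all assumptions added for illustration. The XMLParse(), HTTPGet(), and TCPClose() calls are used exactly as in the steps above.

```crbasic
'Sketch only: the six steps combined into one program.
'Assumptions (not from the article) are marked in the comments.

'Step 1: XMLParse return codes and limits
Const XML_TOO_MANY_NAMESPACES = -3
Const XML_NESTED_TOO_DEEP = -2
Const XML_SYNTAX_ERROR_OR_FAILED = -1
Const XML_UNRECOGNIZED_ERROR_CONDITION = 0
Const XML_START = 1
Const XML_ATTRIBUTE_READ = 2
Const XML_END_OF_ELEMENT = 3
Const XML_END_OF_DOCUMENT = 4
Const XML_MAX_DEPTH = 10
Const XML_MAX_NAMESPACES = 3

'Step 2: result variable
Public noaa_air_temperature_f

'Step 3: variables used while parsing
Public xml_attribute_name As String
Public xml_attribute_namespace As String * 100
Public xml_data As String * 3000
Public xml_element_name As String * 50
Public xml_element_namespace As String * 30
Public xml_response_code
Public xml_state
Public xml_value As String * 50

'Step 4: variables used for file retrieval
Public xml_http_header As String * 300
Public xml_http_socket As Long

'Hypothetical helper flag (assumption): set when temp_f was found this pass
Public temp_found As Boolean

BeginProg
  'Main scan (interval is an assumption). Your own station's measurements go here.
  Scan(5, Sec, 0, 0)
  NextScan

  'Slow sequence: the NOAA data is posted hourly, so request it once an hour (assumption).
  SlowSequence
  Scan(60, Min, 0, 0)
    'Step 5: load the XML file into xml_data.
    'Clearing the buffer first is an assumption, so that a failed request
    'cannot leave the previous hour's XML behind.
    xml_data = ""
    xml_http_header = ""
    xml_http_socket = HTTPGet("http://w1.weather.gov/xml/current_obs/KLGU.xml", xml_data, xml_http_header)
    TCPClose(xml_http_socket) 'Close our connection to the web server.

    'Step 6: parse until an error code or the end of the document.
    temp_found = False
    xml_response_code = XML_START
    While ((xml_response_code > XML_UNRECOGNIZED_ERROR_CONDITION) AND (xml_response_code <> XML_END_OF_DOCUMENT))
      xml_response_code = XMLParse(xml_data, xml_value, xml_attribute_name, xml_attribute_namespace, _
        xml_element_name, xml_element_namespace, XML_MAX_DEPTH, XML_MAX_NAMESPACES)
      If xml_response_code = XML_END_OF_ELEMENT AND xml_element_name = "temp_f" Then
        noaa_air_temperature_f = xml_value
        temp_found = True
      EndIf
    Wend

    'Assumption: report NAN instead of keeping a stale reading when nothing was parsed.
    If NOT temp_found Then
      noaa_air_temperature_f = NAN
    EndIf
  NextScan
EndProg
```

Because everything is declared Public, xml_response_code and xml_http_header can be watched while the program runs, which may help show whether a missing temperature comes from the request or from the parsing.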

More Information

If you have a CR1000, CR3000, CR800, CR850, or CR6 datalogger with an Ethernet interface, you can download and run a working copy of this program.

Recommended for You: For an explanation of the different parts of an XML file, review the “XML Tree” section offered by w3schools.com. This web developer site has basic tutorials that detail the different elements, namespaces, and attributes that can be used in XML.

I hope this information was helpful to you. If you have any questions, please post them below.




About the Author

Gary Roberts is the Product Manager over communications and software products at Campbell Scientific, Inc. He spends his days researching new technology, turning solutions to problems into stellar products, doing second-tier support, or geeking out on Campbell gear. Gary's education and background are in Information Technology and Computer Science. When he's not at work, he is out enjoying the great outdoors with his Scouts, fighting fire/EMS, working amateur radio, or programming computers.



Comments

OnAMission | 03/29/2016 at 10:09 AM

Thanks for the great blog post! These advanced guides are very useful so please keep them coming!
