Lazy parsing in Flex


Sameer M D
Hi,

I would like to know how to implement lazy parsing of XML in Flex. We are
currently processing a huge XML file and parsing it with E4X. The browser
crashes with an out-of-memory error because the XML being loaded is so
large.

How can lazy parsing be implemented in Flex, and would it solve the
out-of-memory problem?

Thanks in advance

Re: Lazy parsing in Flex

Alex Harui-2
Hi Sameer,

The first question is, are you converting your XML to ActionScript
classes?  And if so, how?  If you are using the Flex XML Decoders, then
the following story may help you.

Many years ago, a customer was complaining about slow performance when
receiving a large XML (actually SOAP) response that they had Flex convert
to Value Objects (aka ActionScript Classes).  In my investigation, I
discovered that the Flex application only needed a fraction of the XML
tags in the response.  For example (pardon me if email screws up the
formatting), the customer had a DataGrid with about 20 columns:

Employee Name, Address, City, State, Zip, Region, Hire Date, Exit Date,
Salary, Manager, Department, Phone, Home Phone, Mobile Phone, Social
Security, and a few more.

However, the XML for each employee contained lots of other data, including
salary history, benefits history, and more.  So, the XML might look like:

<employee>
  <firstName>Alex</firstName>
  <lastName>Harui</lastName>
  <address1>...</address1>
  <address2>...</address2>
  <city>...</city>
  <state>...</state>
  <zip>...</zip>
  <hireDate>...</hireDate>
  <!-- and so on, but also -->
  <salaryHistory>
     <salary>
         <date>..</date>
         <pay>..</pay>
     </salary>
     <salary>
         <date>..</date>
         <pay>..</pay>
     </salary>
     <salary>
         <date>..</date>
         <pay>..</pay>
     </salary>
     ...
  </salaryHistory>
</employee>

Some records had dozens of salary and especially benefits history
transactions.  The problem with the default XML/SOAP decoding is that it
is going to visit *EVERY* XML tag.  It will walk the entire tree and
create instances of ActionScript Value Objects for all sub-objects in the
tree.  Most of the time that's useful, but in this case, it was creating
thousands of SalaryHistory and BenefitsHistory objects that were never
used, or at least weren't used in the initial DataGrid.  They might have
been used to drill down into the Employee in some other part of the UI.

So I showed them how to do lazy decoding.  There is no such feature built
into Flex; you have to create your own.  Instead of using an Employee
class that might look like:

public class Employee
{
   public var firstName:String;
   public var lastName:String;
   ...
   public var salaryHistory:Array;
   public var benefitsHistory:Array;
}

And having XMLDecoder or SOAPDecoder run over the entire XML, we changed
the customer code to not use any decoder and use a custom lazy "decoder"
that pretty much does:

var employeeArray:Array = [];
var results:XMLList = xml.employee;
var n:int = results.length();
for (var i:int = 0; i < n; i++)
{
   // Each Employee keeps its XML and decodes nested data only on demand.
   employeeArray.push(new Employee(results[i]));
}

Then Employee is rewritten as:

public class Employee
{
  private var xml:XML;

  public function Employee(xml:XML)
  {
     // Keep the source XML so the history arrays can be built later.
     this.xml = xml;
     // Convert only non-object properties like String and Number up front.
     firstName = xml.firstName;
     lastName = xml.lastName;
     ...
     // Do not convert object properties like SalaryHistory and
     // BenefitsHistory here.
  }

  public var firstName:String;
  public var lastName:String;
  ...
  private var _salaryHistory:Array = null;
  public function get salaryHistory():Array
  {
     // generate SalaryHistory array on demand
     if (!_salaryHistory)
     {
         _salaryHistory = [];
         // one entry per <salary> child of <salaryHistory>
         var history:XMLList = xml.salaryHistory.salary;
         var n:int = history.length();
         for (var i:int = 0; i < n; i++)
            _salaryHistory.push(new SalaryHistory(history[i]));
     }
     return _salaryHistory;
  }
  private var _benefitsHistory:Array = null;
  public function get benefitsHistory():Array
  {
     // generate BenefitsHistory array on demand
     if (!_benefitsHistory)
     {
         _benefitsHistory = [];
         // The child tag name is not shown in the sample XML, so this
         // takes every child element of <benefitsHistory>.
         var history:XMLList = xml.benefitsHistory.elements();
         var n:int = history.length();
         for (var i:int = 0; i < n; i++)
            _benefitsHistory.push(new BenefitsHistory(history[i]));
     }
     return _benefitsHistory;
  }

}

No other changes to their code were required, since the public property
names did not change in the Employee, SalaryHistory, or BenefitsHistory
classes.

IIRC, this improved conversion time by a factor of 20 (from over 1 minute
down to 2 or 3 seconds) and reduced memory consumption as well.  Only the
values actually needed were converted from XML to AS properties.  The
SalaryHistory and BenefitsHistory classes were written similarly to
Employee.  They had a constructor parameter that was the XML to convert
and converted only the fields they needed.
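
A minimal sketch of what such a SalaryHistory class could look like,
assuming it maps one <salary> element from the sample XML above.  The
field types (String for the date, Number for the pay) are assumptions;
the thread only shows the tag names.

```actionscript
package
{
    // Hypothetical value object for one <salary> element.
    public class SalaryHistory
    {
        // Types are assumed; the sample XML only shows <date> and <pay>.
        public var date:String;
        public var pay:Number;

        public function SalaryHistory(xml:XML)
        {
            // Convert only the simple fields this class needs.
            date = String(xml.date);
            pay = Number(xml.pay);
        }
    }
}
```

A BenefitsHistory class would presumably follow the same pattern with
its own fields.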

It also turns out that Flash parses XML lazily, so when the XML arrived as
a String from the server, Flash only converted the top-level tags into XML
nodes and just remembered the strings of tags for SalaryHistory.  But the
default decoders, since they walked every tag, caused Flash to generate an
internal node for every SalaryHistory tag.  With the lazy decoding, Flash
has to create far fewer internal nodes right away, which also reduces
memory consumption.


HTH,
-Alex

Re: Lazy parsing in Flex

Javier Guerrero García
As a rule of thumb, ALWAYS obey what Alex says about Flex :)

But in this case, also consider whether you could flatten the XML structure
(do you really, really REALLY need a tree to represent the data *to be
displayed*, or is it just the result of a quick server-side implementation?)
and use a plain, simple CSV-like format that parses with a plain split and
converts thousands of records in milliseconds (and drastically reduces data
transfer and memory consumption at the same time).
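
A rough sketch of the split-based approach, assuming a hypothetical
payload with one record per line, comma-separated fields, "\n" line
endings, and no quoted fields or embedded commas.  The column order and
the parseRecords name are made up for illustration.

```actionscript
// Meant to live in a class or an <fx:Script> block.
function parseRecords(text:String):Array
{
    var records:Array = [];
    var lines:Array = text.split("\n");
    for (var i:int = 0; i < lines.length; i++)
    {
        var line:String = lines[i];
        if (line.length == 0)
            continue; // skip blank lines, e.g. after a trailing newline
        var fields:Array = line.split(",");
        // Assumed column order: firstName, lastName, city.
        records.push({ firstName: fields[0],
                       lastName: fields[1],
                       city: fields[2] });
    }
    return records;
}
```

Real CSV with quoting or escaped separators would need a proper parser
rather than a plain split.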

And as always, make sure that automatic updates are turned off if the
receiving object is bound somewhere else, or most of the processing time
will go to updating the screen display, only for it to be invalidated again
after the next record is processed (and so on :)
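
One way to read this advice, sketched under the assumption that the
records end up in an ArrayCollection used as a dataProvider: suspend the
collection's change notifications while filling it.  The employees
variable and the addAll function are hypothetical names.

```actionscript
import mx.collections.ArrayCollection;

// Hypothetical collection bound to a DataGrid elsewhere in the UI.
var employees:ArrayCollection = new ArrayCollection();

function addAll(items:Array):void
{
    // Hold back change events so bound views are not refreshed per item.
    employees.disableAutoUpdate();
    for (var i:int = 0; i < items.length; i++)
        employees.addItem(items[i]);
    // Resume notifications; the queued changes reach the views now.
    employees.enableAutoUpdate();
}
```

Building a plain Array first and assigning it to the collection's source
property in one step is another option with a similar effect.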

If you DO really, really need a tree, JSON is also a much better option
bandwidth-wise (no need for an extra closing tag for each value), and it can
represent your data as well as XML can. Also, using tag names like <fn>
instead of <firstName> etc. could save you a few MB if you have a really big
number of records, and hence reduce memory consumption during the first
parse. (The same applies to JSON; CSV already has it built in by default :)


Re: Lazy parsing in Flex

Harbs
In reply to this post by Alex Harui-2
That’s really interesting. Currently in Royale, we’re using browser methods to parse XML, so the entire tree is parsed into a DOM tree at once.

I wonder if that’s something that we can/should optimize…

Harbs

> On Feb 15, 2018, at 10:40 PM, Alex Harui <[hidden email]> wrote:
>
> It also turns out that Flash lazily parses XML, so when the XML arrived as
> a String from the server, Flash only converted the top-level tags into XML
> nodes and just remember the strings of tags for SalaryHistory.


Re: Lazy parsing in Flex

Jeff Dafoe-2

In my opinion, a huge XML tree sent to the client is a design flaw.  If it's coming from a 3rd party source, it should be proxied onto a server under your control.  From there it should be trimmed down and/or parameterized to send the smallest possible payload to the client.

-Jeff

Re: Lazy parsing in Flex

Alex Harui-2
In reply to this post by Harbs
I haven't looked at the Royale XML code, but because most everything is
replaceable, you can always just grab the String from the server result or
wherever you got it and hand it off to a custom parser.

The DataBindingExample in Royale uses a lazy JSON parser, just so I could
show that lazy parsing could be part of the framework.   I have no idea if
it is faster than using the browser to parse JSON or in your case, XML,
but I think we've made it possible.

It really comes down to whether you have control over the result set.
Once your result set is larger than, say, 1MB of text or maybe even 250K,
the odds that you can display all of that text on the screen at once are
very low.  So, if you can avoid bringing down stuff you aren't going to
display right away, you will save bandwidth costs and get better
performance.  But some folks don't have a choice since they are using some
legacy web service.  The lazy parsing answer assumes that they can't
change the web service result set.

If you can change the result set, you should try paging it in.  Flex
supports on-demand paging, or you can page in the background by asking for
the first 50 rows or so, then on the resultEvent, ask for the next 50 and
append them to the collection.
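
A sketch of the background paging idea, assuming an HTTPService that
returns <employee> elements and accepts paging parameters.  The URL and
the "start" and "count" parameter names are hypothetical; a real service
will define its own.  Employee is the lazy class from earlier in the
thread.

```actionscript
import mx.collections.ArrayCollection;
import mx.rpc.events.ResultEvent;
import mx.rpc.http.HTTPService;

const PAGE_SIZE:int = 50;
var employees:ArrayCollection = new ArrayCollection();
var nextStart:int = 0;

var service:HTTPService = new HTTPService();
service.url = "employees.xml";   // hypothetical endpoint
service.resultFormat = "e4x";    // deliver each result as XML
service.addEventListener(ResultEvent.RESULT, onPage);

function requestPage():void
{
    // Hypothetical paging parameters understood by the server.
    service.send({ start: nextStart, count: PAGE_SIZE });
}

function onPage(event:ResultEvent):void
{
    var rows:XMLList = XML(event.result).employee;
    var n:int = rows.length();
    for (var i:int = 0; i < n; i++)
        employees.addItem(new Employee(rows[i]));
    nextStart += n;
    // A full page suggests there may be more rows; a short page ends it.
    if (n == PAGE_SIZE)
        requestPage();
}

requestPage();
```

The first 50 rows can be shown as soon as they arrive while the later
pages keep appending to the same collection.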

My 2 cents,
-Alex
