On 24 Nov 2013, at 17:29, Robert Klemme <shortcutter / googlemail.com> wrote:
>
> Only the part where you use "load". ;-)  We use require for loading
> modules so we avoid loading them multiple times. For relative loading
> you can use require_relative.
Oh, I didn't know about require_relative. It makes sense now. Thank you
once again.
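
To convince myself of the difference, here is a minimal sketch. The
method name and the temporary file are made up for illustration; the
point is only that load runs a file every time, while require runs it at
most once. require_relative behaves like require but resolves the path
relative to the file that calls it, so it is not shown separately here.

```ruby
require "tmpdir"

# Hypothetical demo: count how often a tiny file is executed when it is
# pulled in with load versus require.
def demo_load_vs_require
  $load_count = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter_demo.rb")
    File.write(path, "$load_count += 1\n")

    load path                     # executes the file
    load path                     # executes it again
    after_load = $load_count

    require path                  # not loaded via require yet, so it runs
    second = require path         # already required: skipped, returns false
    [after_load, $load_count, second]
  end
end

p demo_load_vs_require
```

The first number is the count after two load calls, the second is the
count after two further require calls, and the last value is what the
second require returned.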
> If you move all the other code in classes in separate files I would
> probably only leave the main command line processing and top level
> logic in the main .rb file.
>
> <snip/>
That's the ideal I'm striving for.
>>> It's not clear to me why you use a second thread to fill the
>>> connection pool.
>> I am doing this to have connections ready at approximately the same
>> time as the data is prepared (encoded).
>
> I would have assumed that connecting is much cheaper than preparation
> of the data.  So you would not gain that much.  But it may be
> different in your case, of course.
>=20
>>> One thing in your original description stood out:
>>>
>>>> For this, I create thread pool(thread limit = connection limit)
>>>
>>> If you have only as many threads for processing as there are
>>> connections then you do not need a connection pool. It's then much
>>> more efficient to open a connection on thread start, use it as long as
>>> the thread runs and close it when it finishes.  That avoids
>>> synchronization overhead that you have when taking a connection from
>>> the pool and returning it.
>> I've switched to a connection pool because it was quite difficult to
>> track the state of the threads and their relationship to the data (have
>> they finished already, or are there still some data chunks left?). So I
>> moved to single jobs, which could, theoretically, take the data and a
>> connection as parameters. It's much easier to operate on them at this
>> level, but I've run into strange behaviour at the network level.
>
> That is all a bit foggy to me.  Also, I still haven't really grokked
> what your program is supposed to do.  Can you give a birds eye view?
> Do you just prepare data and send it off or are you getting something
> back as well?
I will explain the whole concept in layman's terms. Please excuse me if
some of this isn't necessary.
Usenet, which predates the web, is like one big forum: you can post and
you can read. Everything is done via the NNTP protocol. Binary data has
to be converted to text before it can be posted. The most common
encoding method is yEnc, which produces 8-bit output.
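
For readers who haven't seen yEnc, here is a rough pure-Ruby sketch of
the core byte transformation. It is not my inline C code, and the name
yenc_encode and the default line length of 128 are just choices for this
example. It also leaves out the =ybegin/=yend header lines, the CRC32
and NNTP dot-stuffing, so it is an illustration rather than a complete
encoder.

```ruby
# After the +42 shift, these bytes must be escaped: NUL, LF, CR and '='.
YENC_CRITICAL = [0x00, 0x0A, 0x0D, 0x3D].freeze

# Encode a binary string: add 42 to every byte (mod 256); if the result
# is a critical byte, emit '=' followed by the result plus 64 (mod 256).
def yenc_encode(data, line_length = 128)
  lines = []
  line = String.new(encoding: Encoding::BINARY)
  data.each_byte do |byte|
    shifted = (byte + 42) % 256
    if YENC_CRITICAL.include?(shifted)
      line << "=" << ((shifted + 64) % 256).chr
    else
      line << shifted.chr
    end
    # Start a new line once the current one is full.
    if line.bytesize >= line_length
      lines << line
      line = String.new(encoding: Encoding::BINARY)
    end
  end
  lines << line unless line.empty?
  lines.join("\r\n")
end
```

Most bytes come out as a single byte, which is why yEnc has far less
overhead than Base64 or uuencode.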
So, my program is supposed to encode the given files and post them where
they should go, as efficiently as possible. Because of the efficiency
requirement, I have switched to inline C for encoding. But now I've hit
another wall: 5 MB/s is the most I am able to achieve with my program
(and I see these strange drops too). It is a puzzle which I am planning
to solve.
Communication between the program and the server is almost one-way.
Almost, because I need to receive the server's responses. Otherwise I
won't know whether posting was successful.
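
As a small illustration of what "receiving the server's responses"
means, this sketch classifies the status line the server sends during a
POST. The response codes are the ones defined for POST in RFC 3977; the
constant and method names are invented for this example and are not
taken from my program.

```ruby
# Status codes an NNTP server may return for the POST command (RFC 3977).
POST_RESPONSES = {
  340 => :send_article,          # server is ready, send the article now
  240 => :posted,                # article received OK
  440 => :posting_not_permitted,
  441 => :posting_failed
}.freeze

# Turn a raw status line such as "240 Article received OK\r\n" into a symbol.
def parse_post_response(line)
  code = line[/\A(\d{3})(?:\s|\z)/, 1]
  raise ArgumentError, "malformed NNTP response: #{line.inspect}" unless code

  POST_RESPONSES.fetch(code.to_i, :unexpected)
end
```

An upload thread would send POST, wait for :send_article, send the
encoded article, and only count the chunk as done on :posted.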
>>> If preparing the data for transfer is an expensive task (i.e. takes
>>> some time) then the one connection per thread approach might be
>>> inefficient since the connection will sit there idly most of the time
>>> yet use some resources on the server side.  In those situations it may
>>> be worthwhile to use a connection pool.
>> Preparing data was really expensive before I rewrote this particular
>> part of the code in inline C. Now it takes just about 2 seconds per few
>> hundred megabytes of binary data. Still expensive, but in almost all
>> scenarios the bottleneck will be the network, not the data preparation.
>
> If data preparation is so cheap then maybe a ratio of 2:1 for threads
> to connections is in order.  Then on average two threads "share" a
> connection and one thread can be preparing the data while the other
> one is using the connection to send.
I am currently using only one thread (although with priority slightly
above normal) to prepare the data. I think that this is sufficient. All
the other threads are there only to fetch the data and upload it in the
most efficient manner. I know that I'll be remodelling my upload threads,
I just don't know how yet.
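
One possible shape for that remodelling, combining the suggestion of one
connection per thread with a single encoder thread, might look like the
sketch below. FakeConnection is a hypothetical stand-in for a real NNTP
connection, upload_all and its parameters are made up, and upcase merely
stands in for the encoding step. Closing the queue is what tells the
uploaders that no data is left, so nobody has to track thread state by
hand.

```ruby
# Hypothetical stand-in for a real NNTP connection; it only records chunks.
class FakeConnection
  attr_reader :sent

  def initialize
    @sent = []
    @closed = false
  end

  def post(chunk)
    @sent << chunk
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

def upload_all(chunks, uploader_count: 4, queue_size: 8)
  queue = SizedQueue.new(queue_size) # encoder blocks if uploaders fall behind
  connections = []
  lock = Mutex.new

  uploaders = Array.new(uploader_count) do
    Thread.new do
      conn = FakeConnection.new      # opened once, owned by this thread
      lock.synchronize { connections << conn }
      begin
        # pop returns nil once the queue is closed and drained
        while (chunk = queue.pop)
          conn.post(chunk)
        end
      ensure
        conn.close
      end
    end
  end

  encoder = Thread.new do
    chunks.each { |chunk| queue << chunk.upcase } # upcase stands in for yEnc
    queue.close                                   # signals "no more data"
  end

  [encoder, *uploaders].each(&:join)
  connections
end
```

The bounded queue keeps the encoder from running far ahead of the
network, and each uploader exits on its own once the queue is closed and
empty.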
