|
...find out what I did this summer in 300 words or less)
Update
Hi TL, I need to write an abstract (as though for a scientific publication) as a preliminary step in documenting the work I've done in a research internship this summer.
The work I did was actually not too complicated; it centered on the software side of forensic DNA analysis.
I would like you to give me feedback on whether or not the following makes sense, and whether it provides an adequate summary. Of course you don't know the details, so this is an outside-looking-in sort of test! If you feel like you have a question about something because I left it out, let me know. If you feel like you get it, insofar as your knowledge of DNA forensics takes you, that's good, even if that's not very far.
Also I'm worried about the structure and the flow, not so much whether the jargon is comprehensible. I know it's hard to sort that out completely. This isn't meant to be for laymen, but it should be more accessible than a document meant just for specialists. If you have any kind of science background, most of it should make a bit of sense.
Abstract
Modern forensic DNA typing relies on software programs to interpret and display data obtained from capillary electrophoresis of DNA fragments. Common functions performed by the software include baselining of multiple signal channels, noise-filtering, and smoothing, which are preliminary to analytical functions such as identification of artifacts and allele calling. A forensic analyst typically reviews the results of the software analysis, which depend on user-defined parameters in conjunction with rules that handle variations. For example, a commonly used peak detection threshold at heterozygous loci is 100 RFU, whereas at homozygous loci the threshold is 200 RFU. The analyst makes allele calls (or confirms those made by the software) using information provided by the software that is the result of involved computations. In the case of the example, a peak below the detection threshold is deemed statistically undependable. This peak height can vary based on which parameters are used, and it also depends on aspects of the signal processing algorithms that cannot be altered by the user. While validation of the most commonly used software is extensive, it focuses on parameter settings that reproduce correct results, but does not otherwise account for the underlying signal processing.
This study investigated the disparities in peak heights given by four software programs, GeneMapper ID-X (Applied Biosystems), GeneMarker HID (who), FSS-i3 (who), and TrueAllele (who), operating on the same raw data. We found highly correlated differences in peak heights between these programs. Under the same analysis parameters, these differences led to variations in the profile determined from a single sample by the different programs whenever the profile contained alleles near the peak detection threshold. We conducted a simulation to test the efficacy of adjustments based on regression analysis of our peak height data. Using these adjustments, we were able to significantly reduce discrepancies in profiles exhibiting near-threshold peaks.
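(Side note for the non-specialists reading this: the threshold rule in that example boils down to something like the toy snippet below. The 100/200 RFU numbers come from the abstract; the function and structure are invented for illustration and are nothing like the real software internals.)

    # Toy version of the peak-detection rule from the example above:
    # 100 RFU at heterozygous loci, 200 RFU at homozygous loci.
    # Everything besides those two numbers is made up for illustration.

    HET_THRESHOLD = 100  # RFU, two-peak (heterozygous) loci
    HOM_THRESHOLD = 200  # RFU, single-peak (homozygous) loci

    def locus_passes(peak_heights_rfu):
        """Apply the example thresholds to the peaks seen at one locus."""
        if len(peak_heights_rfu) == 2:  # two alleles -> heterozygous rule
            return all(h >= HET_THRESHOLD for h in peak_heights_rfu)
        if len(peak_heights_rfu) == 1:  # one allele -> homozygous rule
            return peak_heights_rfu[0] >= HOM_THRESHOLD
        return False  # anything else goes to the analyst

    print(locus_passes([140, 120]))  # True: both peaks clear 100 RFU
    print(locus_passes([180]))       # False: lone peak under the 200 RFU rule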
Thanks for your time. =)
|
Feel free to ignore my take on it. I'm a chemical engineer, but each scientific discipline has its own standards when it comes to literature. My overall opinion is that you included too much introductory information. Each paper has a section set aside for the introduction and it seems like you'd be repeating yourself a lot there (much to the chagrin of those actually reading the whole thing). If I were writing this it would look something like:
On July 15 2011 05:34 EatThePath wrote: Modern forensic DNA typing relies on software programs to interpret and display data obtained from capillary electrophoresis of DNA fragments. Common functions performed by the software include baselining of multiple signal channels, noise-filtering, and smoothing, which are preliminary to analytical functions such as identification of artifacts and allele calling. This study investigated the disparities in peak heights given by four software programs, GeneMapper ID-X (Applied Biosystems), GeneMarker HID (who), FSS-i3 (who), and TrueAllele (who), operating on the same raw data. We found highly correlated differences in peak heights between these programs. Under the same analysis parameters, these differences led to variations in the profile determined from a single sample by the different programs whenever the profile contained alleles near the peak detection threshold. We conducted a simulation to test the efficacy of adjustments based on regression analysis of our peak height data. Using these adjustments, we were able to significantly reduce discrepancies in profiles exhibiting near-threshold peaks.
That's just a crude cut and paste. Since I'm not familiar with the subject I can't tell you if I cut out something really important. Then again, a lot of that material seemed like something someone knowledgeable in the field would already know. Again, it's great to have in the body of the paper, but if those reading it are anything like my graduate adviser, who tells us to read the abstract and conclusion and only then decide if it's worth our time, then they may not have the patience.
It's a good write-up, though. I have not composed a paper yet, but within the next year I'll be in your shoes!
|
I have a Ph.D. in Genetics. This is how I would word the genetic parts of your abstract.
"Modern forensic DNA analysis relies on software programs to interpret and display data obtained from electrophoresis of DNA fragments."
The rest looks good.
|
I know what you did last summer.
Just kidding I don't know enough to give a serious comment. Hopefully I made you smile though.
|
I can't comment on content, but I believe it is a little too long. If you go to Google Scholar and read random journal articles, you will see that abstracts are typically a little shorter, and I think some of the content in the first paragraph can be included in the introduction portion of your paper.
good job on doing summer research though, PhD programs <3 that.
edit: gak, what servius said ^^
|
On July 15 2011 05:52 Servius_Fulvius wrote: Feel free to ignore my take on it. [...]
In my job I read plenty of scientific abstracts, mainly cancer research. I agree with this post wholeheartedly. The abstract is intended to give the reader a rundown of what to expect when reading your paper. I would cut the intro section completely and focus on your second paragraph. Normally you want to outline
- what you did
- what you expected
- what you got
- why this is significant
- and what it means for future research.
If the person reading doesn't know about DNA typing then you can outline it in the introduction section of the paper.
In short, keep your audience at the front of your mind. They are most likely experts who want new information; they probably do not want to be taught the basics.
As for the writing style, wording, and flow: it seems fine to me. The flow is good, and the wording not overly specific. All in all, a nicely written abstract; just refocus the content a little.
Good luck mate
|
The first paragraph is useless for an abstract. You want to tell the reader what the paper contains, not background information as to why the stuff it contains is important.
|
Thanks guys for the comments! Keep 'em coming if any more people happen to see this. ^^
I totally agree with all of you about the intro content. Yuck. This abstract won't actually be used for a paper, it's meant to introduce / summarize a poster (like you see in the halls at a lab building) that will be presented to other students/researchers of disparate backgrounds.
Personally I think it's pretty stupid but that's why I have all that intro junk in there, because the abstract is supposed to stand on its own in providing context and significance.
I'm glad it comes across okay. I'll definitely revise the second section to provide more "going forward from these results" comments.
Your input is invaluable folks. ;D
|
On July 15 2011 11:19 EatThePath wrote: Thanks guys for the comments! [...]
Well, in that case you will need some introduction stuff.
Just a couple of sentences outlining the background of DNA typing. You want people to read your content, not the background. Are you giving a presentation with this, or will it stand on its own?
|
On July 15 2011 11:39 Probulous wrote: Well in that case you will need some introduction stuff. [...]
I'm not entirely sure what the people running the program have in mind. I know that I'm giving a presentation to other people here with me (30-40 other interns at this school) where we share our research with each other, but... I really couldn't care less about that. I think that's mostly for practice, eh. They so far have "encouraged us strongly" to submit our research to student conferences that pander to this sort of thing. (Yay, more padding for graduate school apps.) And I understand that many of these conferences weed you out initially based on the abstract alone. Perhaps they also want to use the abstracts for some sort of review / wider publication survey of research kind of thing.
That's as much as I know about that.
To complicate matters, in my own personal case I'm also helping to write the actual paper, which will have different verbiage than this and my poster. So I might be overemphasizing the differences between "the real one" and my "sharing one".
Also, to make the point of the study clear, I have to get this across:
"While validation of the most commonly used software is extensive, it focuses on parameter settings that reproduce correct results, but does not otherwise account for the underlying signal processing." And in order for that to mean anything to a wider audience I have to explain about the software a bit.
I don't want to bother you guys with the nitty-gritty reasoning of revising this to death. I'll go into it more if you want.
|
On July 15 2011 06:41 Probe1 wrote:I know what you did last summer. Just kidding I don't know enough to give a serious comment. Hopefully I made you smile though.
You did, hehe.
|
On July 15 2011 12:24 EatThePath wrote: I'm not entirely sure what the people running the program have in mind. [...]
Well, this complicates things I guess. Still, I would suggest using the paper abstract as your base and then simplifying for the student audience. If these conferences are going to be used for your post-grad entry you want to be seen as scientific first and foremost. You're pretty lucky to be an author on a paper so early. That's a big bonus.
As for the specifics/details: my degree was in bioinformatics, so this is right up my alley. When you say signal processing, you mean the translation of the raw signals into base pairs? If so, interesting area of comparison; I guess this is a neglected area of validation. Using the peak heights is interesting too. Were any of your differences significant enough to warrant a rethink of the software itself?
I mean, if you are getting different sequences with different software packages, how do you interpret your results...
|
On July 15 2011 12:38 Probulous wrote: Well this complicates things I guess. [...]
Oh hai you know what I'm talking about. ;D
Yes I'm super excited to be an author! I'll be revising with my mentor / graduate student who is helping me too, so I think with all this assistance I can strike the right balance. I am thinking now to err on the side of straight and to-the-point.
So... I am dealing with STR profiling, where you essentially count up the repeats by measuring fragment length and assign an allele number. To do this you run a size standard to sync up bp size to time in the run, and you also run a ladder, which has every allele, to provide the software with "bins" so it knows what allele number to call based on where the peak shows up.
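If it helps, the binning idea is basically this toy sketch (the bin positions, tolerance, and names are numbers I just made up, not anything from a real kit):

    # Toy sketch of STR allele binning -- illustrative only. Real typing
    # software does far more (sizing curves, stutter filters, etc.).

    # Hypothetical bins for one locus: allele number -> fragment size (bp),
    # derived from the allelic ladder run.
    LADDER_BINS = {8: 120.0, 9: 124.0, 10: 128.0, 11: 132.0}
    BIN_TOLERANCE = 0.5  # +/- bp window around each bin center

    def call_allele(fragment_size_bp):
        """Return the allele number whose bin contains this fragment size."""
        for allele, center in LADDER_BINS.items():
            if abs(fragment_size_bp - center) <= BIN_TOLERANCE:
                return allele
        return None  # off-ladder peak; flagged for analyst review

    print(call_allele(124.2))  # -> 9
    print(call_allele(126.0))  # -> None (falls between bins)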
The peak heights don't matter at all most of the time. If they are super high it can cause artifacts, and if they are unexpectedly low you probably had a problem during your PCR or whatnot. Also, if you have low copy number to start with or degraded DNA (like exhumed remains) you'll have low peaks, because you just don't have that much to start with, so it's low signal. So you get situations where your peaks are too low and they don't meet the detection threshold. In this case you throw out that locus as inconclusive, which gives you a partial profile. By analogy, you have a partial fingerprint.
Almost everyone in the US uses the same piece of software from the company whose instruments almost everyone uses too. But there are new programs entering the market and they are not being looked at a whole lot yet. So my study found that different programs give slightly higher or lower peak heights overall, which is the result of slightly different approaches in the initial raw data processing. They all give you the same profile under normal circumstances because that just depends on the sizing of the peaks, aka fragment length (not height Y, just X position). When you have low peaks (for whatever reason) you can have a situation where in one program a peak is just under the threshold so you don't include that locus, and in another program it meets the threshold and you include the locus. So you have differing completeness of profiles which are concordant, but one has more information. A given locus can represent discriminatory power (likelihood of matching a random human) of like 1/1000. If you gain or lose two loci, the odds you tell the jury in court can differ by a million-fold.
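Here's a toy illustration of that near-threshold effect (the peak height and the scale factor between programs are invented, purely to show the mechanics):

    # Toy illustration of near-threshold discordance between two programs.
    # The height and the inter-program scale factor are invented.

    DETECTION_THRESHOLD = 100  # RFU, same setting in both programs

    # The same physical peak as reported by two hypothetical programs whose
    # raw-data processing differs slightly (program B reads ~8% lower).
    peak_program_a = 104.0
    peak_program_b = peak_program_a * 0.92  # ~95.7 RFU

    for name, height in [("A", peak_program_a), ("B", peak_program_b)]:
        status = "included" if height >= DETECTION_THRESHOLD else "dropped as inconclusive"
        print("Program %s: %.1f RFU -> locus %s" % (name, height, status))
    # Same sample, same parameters, yet program A keeps the locus and
    # program B throws it out: differing completeness of profile.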
The million-fold thing is a dramatic way to put it, but it illustrates the point--you want concordance in your profiling. We tested the solution of just adjusting the threshold based on our regressions, which works pretty well; it helps a lot. But at a certain level your signal processing is just not the same and there's no way around that. Big picture, it's not a huge deal but it's something that people should think about.
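The adjustment we tested is conceptually just this (a sketch; the paired heights below are fabricated placeholders, not our actual regression data):

    # Sketch of the threshold-adjustment idea: fit program B's peak heights
    # against program A's, then map A's detection threshold onto B's scale.
    # The paired heights are fabricated for illustration.

    heights_a = [150.0, 300.0, 450.0, 600.0, 750.0]  # RFU, program A
    heights_b = [136.0, 274.0, 412.0, 551.0, 688.0]  # RFU, same peaks in B

    # Ordinary least squares by hand: height_b ~= slope * height_a + intercept
    n = len(heights_a)
    mean_a = sum(heights_a) / n
    mean_b = sum(heights_b) / n
    slope = (sum((a - mean_a) * (b - mean_b)
                 for a, b in zip(heights_a, heights_b))
             / sum((a - mean_a) ** 2 for a in heights_a))
    intercept = mean_b - slope * mean_a

    # Program A's 100 RFU threshold expressed in program B's height scale.
    threshold_b = slope * 100.0 + intercept
    print("Equivalent threshold in program B: %.1f RFU" % threshold_b)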
Wow I gassed on. Hope you find it interesting.
|
On July 15 2011 13:10 EatThePath wrote: Oh hai you know what I'm talking about. ;D [...]
Yeah that was what I expected. One or two loci isn't going to break the bank unless you are really low on your initial sample, in which case there are other sources of error. It is an interesting area to look at, simply because most people just assume that what they read is correct.
Well done on the work. Good luck with the paper; any ideas what you want to do after this?
Edit: Oh, and you will soon learn that on TL there is always someone who knows what you are talking about. Doesn't matter the subject. There are some serious brains on this site. And no, I don't consider myself a brain.
Edit 2: Turns out you have been here way longer than I have, so now I look like an idiot.
|
On July 15 2011 13:21 Probulous wrote: Yeah that was what I expected. [...]
Exactly, people just go with the program. To be fair, the demand for profiling (not just in criminal forensic applications) is high and growing, so there will be lots of people who are basically technicians. In that kind of world we should make sure all the automated stuff works properly first. And in my experience the forensic community is utterly dedicated to that, even if they are pulled and pressured from different angles, like state-mandated caseload processing rates.
What do I want to do? I'm going to make artificial intelligences like Cortana from Halo.
Edit: lol, no worries, I barely consider myself a TLer. I have seen some impressive folks though; did you see that thread "what is your occupation / aspiration?" I was blown away.
|
On July 15 2011 13:28 EatThePath wrote: Exactly, people just go with the program. [...]
Yup, makes my career seem really mundane. Oh well, my job was never going to define me.
You are certainly reaching for the stars. I don't play Halo, so I am afraid that reference just flew by me.
Given your direction, I would imagine you've read a bit of Kurzweil's theory on the singularity. You want to be a part of developing that? It is pretty mind-bending; not sure I could give up my inferior fleshiness.
|
On July 15 2011 13:34 Probulous wrote: Yup, makes my career seem really mundane. [...]
Nah man, there's every chance I'll end up "not being defined by my career" but that's what I'm shooting for. Finally. After 8 years without a university degree, I've found my calling in the last few months. I hadn't heard of Kurzweil before, but a quick googling shows me things I can jive with. I just finished reading "The User Illusion" by Tor Norretranders, which synthesizes basically everything I hold dear that I've come to know about math, science, and the universe. It's a thesis about consciousness, but it's based on physics, ultimately.
I don't go in for the sci-fi "let's build the future" kind of motivation, even though I'm very partial to it. For me it's more about understanding the universe and where that takes us as humans. The salient point there is that information is a physical quantity. The ramifications of that are pretty intense. I would love to tell you about it at length, lol. ;D
Maybe there will be a blog series. Or a Skype science club meeting when it's not midnight.
What are you doing with a bioinformatics degree that has you reading abstracts?
|
On July 15 2011 13:45 EatThePath wrote: [snip]
I didn't quite mean it like that. Just that some people design their life around what they want to be. Anyway, Kurzweil is a very interesting read because he is quite logical and clear about what is currently happening; he then takes things to their logical conclusion, nothing more. His premise is that with the continued exponential increase in computing power and developments in AI, we will eventually create a machine that passes the Turing test. He predicts somewhere around 2040 based on current rates of development (the toy arithmetic after this post plays with that premise).

I haven't read much about information as a physical entity, but I suppose that makes sense, in that information can be seen as the transfer of states. If all the information that I understand is captured in the structure of my brain, then it must have a physical structure? Interesting...

My job is way off base. I work for a pharma company looking after our preclinical research projects. Basically I get to read scientists' abstracts before they go to print. Pretty awesome in that respect.

Blog series sounds like a plan. It is kinda hard to organise stuff being in Oz.
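Purely to play with the "exponential increase" premise above (toy arithmetic only; the two-year doubling time is an invented placeholder, not Kurzweil's actual figure):

```python
# If some capability doubles every `doubling_years`, the growth factor
# between two dates is 2 ** (elapsed / doubling_years). Invented numbers.
def growth_factor(start_year, end_year, doubling_years=2.0):
    return 2 ** ((end_year - start_year) / doubling_years)

print(growth_factor(2011, 2040))  # 2^14.5, about 23,000x by the predicted date
```

The point of the exercise is just how fast a fixed doubling time compounds, which is the engine behind any singularity-style extrapolation.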
|
On July 15 2011 14:03 Probulous wrote: [snip]
Sounds like a good setup for keeping the brain engaged and fending off boredom. Maybe I'm naive, but I'm always stunned when I hear about niches like yours: "you get paid for that??"
I just read all up and down the Wikipedia page. So did you read the book? I've given a lot of thought to the general social problem of increasing rates of technological development. The singularity notion is new to me, and it's cool. I need to read more before I decide for sure to debate it. The part I bolded, I have beef with that; for next time.
Cheers, thanks again for the tips, I really appreciate it. ^^
|
No sweat, mate. PM me next time you post a blog; I am not on all the time.
|