How do you decide what code to put into a function?

Question

I started out with a script that was a few hundred lines long. Later, I realized I wanted another script that would require much of the same code. I decided to wrap the parts of the original script that would be shared into function definitions. When I was deciding exactly what should be in a function, I came across various things to consider:

What should I set as its input parameters? If anything was needed in the function, I should probably require it as a parameter, right? Or should I declare it within the function itself?
If function x always requires the output of function y, but function y sometimes is needed alone, should function X include the code of function y or simply call function y?
Imagine a case where I have 5 functions that all call a function 'sub' (sub is essential for these 5 functions to complete their work). If 'sub' always is supposed to return the same result, then wouldn't multiple calls from these parent functions duplicate the same work? If I move the call to 'sub' outside of the 5 functions, how can I be sure that 'sub' is called before the first call to any of the 5 functions?
If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function. However, if it is not a function, but is later required in another application, then should it become a function?

Sorry if these questions are too vague, but I feel there should be some general guidelines. I haven't programmed for very long, and I have bounced around between OOP and functional, but I don't remember ever reading anything that explained this. Could it simply be a matter of personal preference?

Doc Brown · Answer 1 · 2022-07-19T12:11:27.033

You are thinking of functions only as containers for code that is reused somewhere. That's not the worst start when learning how to build functions, but it is not necessarily the only approach or the "best" way.

A different point of view is to use functions for creating abstractions. You wrote that you have a script with a few hundred lines - fine. Within this script there are blocks, each doing a certain subtask. Each of these blocks belongs in a function of its own. The name of the function should be self-explanatory, telling you what that subtask is. When you avoid global variables and side effects, it becomes self-evident what parameters your functions will need and what they must return.

This way, your functions become the building blocks of your application. This is mostly independent of how often they are reused, and independent of how often they are called.
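
As a minimal sketch of that idea, assume a script that parses some text, filters it, and sums a value. The function names (`parseRows`, `keepValidRows`, `totalAmount`) and the sample data are hypothetical, made up for illustration. Each subtask takes what it needs as parameters and returns its result, so nothing depends on global state.

```javascript
// Hypothetical example: a script split into named subtasks.
// Each function receives its input as a parameter and returns its result;
// nothing is read from or written to global variables.

function parseRows(text) {
    // Turn "name,amount" lines into objects.
    return text.trim().split("\n").map(function (line) {
        const parts = line.split(",");
        return { name: parts[0], amount: Number(parts[1]) };
    });
}

function keepValidRows(rows) {
    // Drop rows whose amount is not a finite number.
    return rows.filter(function (row) {
        return Number.isFinite(row.amount);
    });
}

function totalAmount(rows) {
    return rows.reduce(function (sum, row) {
        return sum + row.amount;
    }, 0);
}

// The top level of the script now reads like a summary of the work.
const input = "apples,3\npears,oops\nplums,4";
const total = totalAmount(keepValidRows(parseRows(input)));
console.log(total); // prints 7
```

None of these functions has to be reused to be worth having; the names alone document what the script does.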

If function x always requires the output of function y, but function y sometimes is needed alone, should function X include the code of function y or simply call function y?

y is a single subtask within x, something for which you can easily come up with a separate name. Hence it should clearly be in its own function - independently of being needed alone or not!

Imagine a case where I have 5 functions that all call a function 'sub' (sub is essential for these 5 functions to complete their work). If 'sub' always is supposed to return the same result, then wouldn't multiple calls from these parent functions duplicate the same work? If I move the call to 'sub' outside of the 5 functions, how can I be sure that 'sub' is called before the first call to any of the 5 functions?

First, if the same thing is calculated twice, does that really matter in your case? There is no need to optimize that away as long as you do not have a proven performance bottleneck. Second, if you really have a performance bottleneck in such a case, there is the technique of memoization, which solves exactly this problem.
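
A minimal sketch of memoization, assuming a function of a single primitive argument. The helper `memoize`, the stand-in `slowSquare`, and the counter `callCount` are hypothetical names used only for illustration. The technique is only safe when the wrapped function is deterministic and has no side effects the caller relies on.

```javascript
// Generic memoizer for functions of a single primitive argument.
function memoize(fn) {
    const cache = new Map();
    return function (arg) {
        if (!cache.has(arg)) {
            cache.set(arg, fn(arg)); // do the real work only on a cache miss
        }
        return cache.get(arg);
    };
}

let callCount = 0; // only here to show how often the real work runs

function slowSquare(n) {
    callCount += 1; // stand-in for an expensive calculation
    return n * n;
}

const fastSquare = memoize(slowSquare);

fastSquare(12); // computed
fastSquare(12); // served from the cache
console.log(callCount); // prints 1
```

The callers do not change: they keep calling the function whenever they need the value, and the cache takes care of the duplicated work.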

If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function.

That is exactly what you should do differently - put it into a function, not for the purpose of reuse, but for the purpose of creating an abstraction.

score 2 · Answer 2 · answered May 31 '13 at 20:00

In Clean Code, Robert Martin argues that fewer parameters are better. Consider whether something should be an instance variable instead of a parameter. Also, see Preserve Whole Object in Refactoring by Martin Fowler.
Call function y. Don't Repeat Yourself (DRY), as recommended in The Pragmatic Programmer (Hunt and Thomas).
I'm not sure what you mean by "always returns the same result". Do you mean that the same result is returned for the same input parameters (deterministic)? Are you asking about lazy loading?
Before I read Clean Code, I was reluctant to create small functions that were used only once. Now I do so frequently. Consider extracting a function if you have a long function with several blocks that represent sub-steps in the function. The tell-tale sign is a comment that says something like "Calculates Monthly Sales". You should probably extract the code block into a function called calculateMonthlySales.
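
The last point can be sketched as follows. Only the name `calculateMonthlySales` comes from the answer; the data shape (`month`, `amount`) and the surrounding `buildReport` function are assumptions made for illustration.

```javascript
// Hypothetical data shape: each sale has a "month" (1-12) and an "amount".
function calculateMonthlySales(sales, month) {
    let total = 0;
    for (const sale of sales) {
        if (sale.month === month) {
            total += sale.amount;
        }
    }
    return total;
}

function buildReport(sales, month) {
    // Before the extraction, the loop above lived here under a
    // "Calculates Monthly Sales" comment. Now the function name says it.
    const monthlySales = calculateMonthlySales(sales, month);
    return "Sales for month " + month + ": " + monthlySales;
}

const sales = [
    { month: 5, amount: 100 },
    { month: 5, amount: 50 },
    { month: 6, amount: 70 },
];
console.log(buildReport(sales, 5)); // prints "Sales for month 5: 150"
```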

score 2 · Accepted Answer · answered May 31 '13 at 20:01

There are two ways to pass data into a function: parameters or globals. In small scripts globals are acceptable, but really try to avoid them.
It's easier to simply extract y. This is also better when you need to change y later.

You can use lazy initialization here:

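var subresult = undefined; // cached result; undefined means "not computed yet"

function sub() {
    // Do the work only on the first call, then reuse the stored value.
    if (subresult === undefined) {
        subresult = calculate(); // placeholder: calculate... (the actual work goes here)
    }
    return subresult;
}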

There are two competing principles here, SRP and YAGNI:

A. Single Responsibility Principle: essentially, each function should do a single thing and do it correctly.

B. You Aren't Going to Need It: don't waste time on stuff that may or may not be needed in the future; focus on what you need now.

score 1 · Answer 4 · answered May 31 '13 at 19:57

There are no recipes for logic, but I will answer your specific questions:

Q: What should I set as its input parameters?

A: the things needed to do the calculation

Q: If anything was needed in the function, I should probably require it as a parameter, right?

A: yes, or you can get it from a file or database

Q: If function x always requires the output of function y, but function y sometimes is needed alone, should function X include the code of function y or simply call function y?

A: simply call y

Q: Imagine a case where I have 5 functions that all call a function 'sub' (sub is essential for these 5 functions to complete their work). If 'sub' always is supposed to return the same result, then wouldn't multiple calls from these parent functions duplicate the same work?

A: if sub always returns the same result, then no work is being done.

Q: If I move 'sub' outside the 5 functions, how can I be sure that 'sub' is called before the first call to any of the 5 functions?

A: wasn't sub already outside the 5 functions? Do not call sub from within any of those functions. Call it in the parent program before calling the 5 functions.
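
One way to sketch this, assuming the result of sub can be handed to the functions as a parameter: the names `priceWithShipping`, `describeShipping`, and `shippingFee`, and the counter `subCalls`, are hypothetical and only illustrate the shape. Because each function requires the result as an argument, it cannot be called before sub has run.

```javascript
let subCalls = 0; // only here to show that sub runs once

function sub() {
    subCalls += 1;
    return { shippingFee: 5 }; // stand-in for the shared result
}

// The functions take sub's result as a parameter instead of calling sub.
function priceWithShipping(config, price) {
    return price + config.shippingFee;
}

function describeShipping(config) {
    return "Shipping: " + config.shippingFee;
}

// Parent program: sub is called once, before anything that needs it.
const config = sub();
console.log(priceWithShipping(config, 20)); // prints 25
console.log(describeShipping(config));      // prints "Shipping: 5"
console.log(subCalls);                      // prints 1
```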

Q: If I have a segment of code that always produces the same result, and isn't required more than once in the same application, I normally wouldn't put it in a function. However, if it is not a function, but is later required in another application, then should it become a function?

A: If it is complex enough, put it into a function. If another app needs it, share the library.
